* feat: Cache raw YAML parsing results of files

Reading and parsing a YAML file is expensive. This commit optimizes
for use cases where the same YAML files are reused across inputs by
caching the raw parsing result.
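
A minimal sketch of the idea, not the project's code: the result of an expensive parse is memoized by file path with `functools.lru_cache`. To keep the sketch stdlib-only, `json.loads` stands in for the YAML parser; `parse_text`, `parse_calls`, and `demo` are hypothetical names used only for illustration.

```python
import functools
import json
import tempfile
from pathlib import Path

parse_calls = 0  # counts real parses, for illustration only


def parse_text(text):
    """Stand-in for the expensive YAML parse (assumption: JSON instead of YAML)."""
    global parse_calls
    parse_calls += 1
    return json.loads(text)


@functools.lru_cache(maxsize=1024)
def load_path(path):
    # Path objects are hashable, so the path itself serves as the cache key.
    return parse_text(path.read_text())


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "job.json"
        path.write_text('{"name": "example"}')
        first = load_path(path)
        second = load_path(path)  # served from the cache, no second parse
        return first, second, parse_calls
```

Note that a cache keyed only on the path does not notice later changes to the file on disk, so this pattern fits runs in which the files are not modified after first being read.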

* Pin setuptools version to 69.5.1 for Python >= 3.8

A change introduced in setuptools version 70.0.0 causes tests under
test_moduleregistry.py to fail with "InvalidVersion". Pin the setuptools
version for Python >= 3.8 until this is investigated and resolved.

Change-Id: I534d7a6d9992780cd8de4f8460190caf5c3bde12
Ankush Pathak 2024-07-03 11:33:03 -05:00
parent 0c012177d3
commit ed2c37304c
2 changed files with 10 additions and 2 deletions

@@ -9,7 +9,7 @@
 # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 # License for the specific language governing permissions and limitations
 # under the License.
+import functools
 import io
 import logging
 import warnings
@@ -58,7 +58,14 @@ class Loader(LocLoader):
     def load_fp(self, fp):
         return self.load(fp)
+    # 1024 cache size seems like a reasonable balance between memory requirement and performance gain
+    @functools.lru_cache(maxsize=1024)
     def load_path(self, path):
+        # Caching parsed file outputs is safe even with _retain_anchors set to True because:
+        # PyYAML does not allow updating anchor values, it is considered as anchor duplication
+        # So we can safely cache a parsed YAML for a file containing an alias since the alias can be defined
+        # only once and must be defined before use. The alias value will remain same irrespective of the number
+        # times a file is parsed
         return self.load(path.read_text(), source_path=path, source_dir=path.parent)
     def load(self, stream, source_path=None, source_dir=None):
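
When `functools.lru_cache` decorates a method, `self` is part of the cache key and every hit returns the same object that was stored. The sketch below illustrates that general behavior with a hypothetical stand-in class (it is not the project's `Loader`, and its return value is made up). Whether shared results matter in practice depends on whether callers mutate what `load_path` returns.

```python
import functools


class Loader:
    """Hypothetical stand-in used only to illustrate lru_cache on a method."""

    def __init__(self):
        self.parse_count = 0

    @functools.lru_cache(maxsize=1024)
    def load_path(self, path):
        # Pretend this is the expensive read-and-parse step.
        self.parse_count += 1
        return {"source": path, "jobs": []}


loader = Loader()
a = loader.load_path("jobs.yaml")
b = loader.load_path("jobs.yaml")  # cache hit: the very same object as `a`
a["jobs"].append("mutated")        # the change is visible through `b` too

other = Loader()
c = other.load_path("jobs.yaml")   # different `self`, so a separate cache entry
```

The cache lives on the class-level function and keeps a reference to each `self` it has seen until the entry is evicted or `Loader.load_path.cache_clear()` is called.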

@@ -9,4 +9,5 @@ stevedore>=1.17.1; python_version >= '3.0' # Apache-2.0
 python-jenkins>=1.8.2
 fasteners
 Jinja2
-setuptools
+setuptools==69.5.1; python_version >= '3.8'
+setuptools; python_version < '3.8'