* feat: Cache raw YAML parsing results of files

Reading and parsing a YAML file is expensive. This commit optimizes for
use cases where the same YAML files are reused across inputs by caching
the raw parsing result.
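
The caching idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `parse_text` is a hypothetical stand-in for the real YAML parser, and `load_path_cached` is an assumed name. The function is memoized on the path with `functools.lru_cache`, so a file requested repeatedly is read and parsed once.

```python
import functools
import tempfile
from pathlib import Path

parse_calls = 0  # counts how often the (expensive) parse step actually runs


def parse_text(text):
    """Hypothetical stand-in for a real YAML parser: splits 'key: value' lines."""
    global parse_calls
    parse_calls += 1
    result = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            result[key.strip()] = value.strip()
    return result


@functools.lru_cache(maxsize=1024)
def load_path_cached(path):
    # Path objects are hashable, so they can serve directly as cache keys.
    return parse_text(path.read_text())


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "defaults.yaml"
        path.write_text("name: example\nnode: builder\n")
        first = load_path_cached(path)
        second = load_path_cached(path)  # served from the cache, no re-parse
    return first, second, parse_calls
```

Two consequences of this design are worth keeping in mind: the cache returns the same object on every hit, so callers share it, and an entry is not refreshed if the file changes on disk while the process is running.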

* Pin setuptools version to 69.5.1 for Python >= 3.8

A change introduced in setuptools version 70.0.0 causes tests under
test_moduleregistry.py to fail with "InvalidVersion". Pin the setuptools
version for Python >= 3.8 until this is investigated and resolved.
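
The effect of the two marker-qualified requirement lines can be approximated as below. This is a hand-rolled sketch, not pip's marker evaluator: `setuptools_requirement` is a hypothetical helper that only mirrors the `python_version >= '3.8'` and `python_version < '3.8'` conditions, showing that exactly one of the two lines applies to any given interpreter.

```python
import sys


def setuptools_requirement(version_info=sys.version_info):
    """Return the requirement string that would apply for the given interpreter version."""
    # The python_version marker only considers major.minor, so compare those two fields.
    if tuple(version_info[:2]) >= (3, 8):
        return "setuptools==69.5.1"
    return "setuptools"
```

For example, `setuptools_requirement((3, 7, 9))` selects the unpinned line, while any 3.8 or newer interpreter selects the pinned one.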

Change-Id: I534d7a6d9992780cd8de4f8460190caf5c3bde12
Ankush Pathak 2024-07-03 11:33:03 -05:00
parent 0c012177d3
commit ed2c37304c
2 changed files with 10 additions and 2 deletions

@@ -9,7 +9,7 @@
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
import functools
import io
import logging
import warnings
@@ -58,7 +58,14 @@ class Loader(LocLoader):
    def load_fp(self, fp):
        return self.load(fp)

    # 1024 cache size seems like a reasonable balance between memory requirement and performance gain
    @functools.lru_cache(maxsize=1024)
    def load_path(self, path):
        # Caching parsed file outputs is safe even with _retain_anchors set to True:
        # PyYAML does not allow an anchor to be redefined; it treats that as a duplicate anchor.
        # An anchor can therefore be defined only once and must be defined before any alias uses it,
        # so the value an alias resolves to stays the same no matter how many times the file is parsed.
        return self.load(path.read_text(), source_path=path, source_dir=path.parent)

    def load(self, stream, source_path=None, source_dir=None):
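
Because `lru_cache` is applied to a method here, `self` becomes part of the cache key. The sketch below illustrates that behavior with a hypothetical `SketchLoader` class (not the project's `Loader`); the returned dictionary is a placeholder for a parse result.

```python
import functools


class SketchLoader:
    """Hypothetical loader used only to illustrate method-level caching."""

    def __init__(self):
        self.parse_count = 0

    @functools.lru_cache(maxsize=1024)
    def load_path(self, path):
        # `self` is part of the cache key, so each instance gets its own entries.
        self.parse_count += 1
        return {"source": path}


def demo():
    a, b = SketchLoader(), SketchLoader()
    a.load_path("jobs.yaml")
    a.load_path("jobs.yaml")  # hit: same instance, same path
    b.load_path("jobs.yaml")  # miss: different instance
    info = SketchLoader.load_path.cache_info()
    return a.parse_count, b.parse_count, info.hits, info.misses
```

The cache is shared at class level and holds a strong reference to every instance it has seen until the entry is evicted, which matters mainly when many short-lived loader instances are created.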

@@ -9,4 +9,5 @@ stevedore>=1.17.1; python_version >= '3.0' # Apache-2.0
python-jenkins>=1.8.2
fasteners
Jinja2
setuptools
setuptools==69.5.1; python_version >= '3.8'
setuptools; python_version < '3.8'