author: Mike Frysinger <vapier@google.com> 2021-12-21 00:40:31 -0500
committer: Xin Li <delphij@google.com> 2022-05-26 00:02:18 +0000
commit: 1d00a7e2ae64b6c08aff60c2e7ed5c2d89caf8d6
tree: 7804e4ba1a12a0a25bbff7360d69010e64464c72 /project.py
parent: 3a0a145b0ec52dff88b021c5937925f25294a10f
project: initial separation of shared project objects

For now, this is opt-in via environment variables:
- export REPO_USE_ALTERNATES=1

The shared project logic that shares the internal .git/objects/ dir
directly between multiple projects via the project-objects/ tree has
a lot of known issues with random corruption. It all boils down to
projects sharing objects/ but not refs/. Git operations that use refs
to see what objects are reachable and discard the rest can easily
discard objects that are used by other projects.

Consider this project layout:
<show fs layout>

There are unique refs in each of these trees that are not visible in
the others. This means it's not safe to run basic operations like
git prune or git gc.

Since we can't share refs (each project needs to have unique refs like
HEAD in order to function), let's change how we share objects. The
old way involved symlinking .git/objects/ to the project-objects tree.
The new way shares objects using git's info/alternates.

This means project-objects/ will only contain objects that exist in
the remote project. Local per-project objects (like when creating
branches and making changes) will never be shared. When running a
prune or gc operation in the per-project state, it will only ever
repack or discard those per-project objects. The common shared
objects would only be cleaned up when running a common operation
(i.e. by repo itself).

One downside to this for users is if they try blending unrelated
upstream projects. For example, in CrOS we have multiple kernel
projects (for different versions) checked out. If a dev fetched the
upstream Linus tree into one of them, the objects & tags would not
be shared with the others, so they would have to fetch the upstream
state for each project. Annoying, but better than the current
corruption situation we're in now.

Also, if the dev runs a manual `git fetch` in a per-project checkout
to sync it up to a newer state than the last `repo sync` they ran, the
objects would get duplicated. However, git operations later on should
eventually dedupe this.

Bug: https://crbug.com/gerrit/15553
Change-Id: I313a9b8962f9d439ef98ac0ed37ecfb9e0b3864e
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/328101
Reviewed-by: Mike Frysinger <vapier@google.com>
Tested-by: LaMont Jones <lamontjones@google.com>
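
The commit message contrasts two on-disk layouts: the old one, where the per-project objects/ directory is a symlink into project-objects/, and the new one, where it is a real directory containing an info/alternates file. As a rough illustration only, the two could be told apart like this. The helper name sharing_layout and its return strings are hypothetical and are not part of repo.

```python
import os


def sharing_layout(gitdir):
    """Classify how a per-project git dir shares its objects.

    Hypothetical helper for illustration; not part of repo itself.
    Returns 'symlink' for the old layout, 'alternates' for the new one,
    and 'standalone' when neither mechanism is in use.
    """
    objects = os.path.join(gitdir, 'objects')
    # Old layout: objects/ itself is a symlink into the shared tree.
    if os.path.islink(objects):
        return 'symlink'
    # New layout: objects/ is a real directory that lists the shared
    # object store in objects/info/alternates.
    if os.path.isfile(os.path.join(objects, 'info', 'alternates')):
        return 'alternates'
    return 'standalone'
```

The symlink check comes first because a symlinked objects/ would otherwise expose whatever info/alternates file the shared store happens to contain.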
Diffstat (limited to 'project.py')
-rw-r--r-- project.py 16
1 files changed, 15 insertions, 1 deletions
diff --git a/project.py b/project.py
index b7ed6f33..faa6b32b 100644
--- a/project.py
+++ b/project.py
@@ -49,6 +49,9 @@ MAXIMUM_RETRY_SLEEP_SEC = 3600.0
 # +-10% random jitter is added to each Fetches retry sleep duration.
 RETRY_JITTER_PERCENT = 0.1
 
+# Whether to use alternates.
+# TODO(vapier): Remove knob once behavior is verified.
+_ALTERNATES = os.environ.get('REPO_USE_ALTERNATES') == '1'
 
 def _lwrite(path, content):
   lock = '%s.lock' % path
@@ -460,7 +463,7 @@ class RemoteSpec(object):
 
 class Project(object):
   # These objects can be shared between several working trees.
-  shareable_dirs = ['hooks', 'objects', 'rr-cache']
+  shareable_dirs = ['hooks', 'rr-cache']
 
   def __init__(self,
                manifest,
@@ -1143,6 +1146,17 @@ class Project(object):
       self._UpdateHooks(quiet=quiet)
     self._InitRemote()
 
+    if _ALTERNATES or self.manifest.is_multimanifest:
+      # If gitdir/objects is a symlink, migrate it from the old layout.
+      gitdir_objects = os.path.join(self.gitdir, 'objects')
+      if platform_utils.islink(gitdir_objects):
+        platform_utils.remove(gitdir_objects, missing_ok=True)
+      gitdir_alt = os.path.join(self.gitdir, 'objects/info/alternates')
+      if not os.path.exists(gitdir_alt):
+        os.makedirs(os.path.dirname(gitdir_alt), exist_ok=True)
+        _lwrite(gitdir_alt, os.path.join(
+            os.path.relpath(self.objdir, gitdir_objects), 'objects') + '\n')
+
     if is_new:
       alt = os.path.join(self.objdir, 'objects/info/alternates')
       try:
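
The migration in the last hunk can be tried outside of repo with a standalone sketch. It uses only the standard library: os.unlink stands in for platform_utils.remove and a plain write stands in for _lwrite, so the lock-file handling is deliberately omitted. The function name migrate_to_alternates and the directory names in the usage note are assumptions made for illustration.

```python
import os


def migrate_to_alternates(gitdir, objdir):
    """Point gitdir/objects at objdir/objects via info/alternates.

    Illustrative sketch of the logic in the hunk above; returns the path
    of the alternates file.
    """
    gitdir_objects = os.path.join(gitdir, 'objects')
    # Old layout: objects/ is a symlink into the shared tree. Remove only
    # the link; the shared objects themselves are left untouched.
    if os.path.islink(gitdir_objects):
        os.unlink(gitdir_objects)
    alt = os.path.join(gitdir_objects, 'info', 'alternates')
    if not os.path.exists(alt):
        os.makedirs(os.path.dirname(alt), exist_ok=True)
        # Git resolves relative alternates entries against the objects/
        # directory that holds them, so the path is computed from there.
        rel = os.path.join(os.path.relpath(objdir, gitdir_objects), 'objects')
        with open(alt, 'w') as f:
            f.write(rel + '\n')
    return alt
```

With a hypothetical gitdir of projects/foo.git and objdir of project-objects/foo.git under the same parent directory, the file ends up containing ../../../project-objects/foo.git/objects on POSIX systems. Because os.path.relpath is purely lexical, it still works after the symlink has been removed and before the real objects/ directory exists.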