GitHub: git push is very slow for a huge repo

I ran into the same problem as in "git push is very slow for a branch",
but the answers there do not apply to my case.

I am working with a corporate GitHub that hosts a huge repo. My process is as follows:

1) Pull from master

2) Create a new branch

3) Commit

4) Push the branch to create a pull request.

When I push the branch in step (4), Git writes around 1,000,000 objects, totalling about 3 GB, even though my commit changes only one line.

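If I go to the GitHub UI, create a branch there with the same name as in (2), and then push to that branch, the push takes less than a second. Needless to say, the changes between master and my branch are small (no large files added or removed).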

What can I do to make Git push only the relevant data instead of the entire repo?

I am using Git for Windows, version 2.17.0.


You could try the same push with:

  • Git for Windows 2.21
  • git config --global pack.sparse true (I presented the pack.sparse option in March 2019)

This option comes from these patches and is implemented in commit d5d2e93, which includes the comment:

These improvements will have even larger benefits in the super-large Windows repository.

That should be interesting in your case.

See "Exploring new frontiers for Git push performance" by Derrick Stolee.

A git push typically displays something like the following:

$ git push origin topic
Enumerating objects: 3670, done.
Counting objects: 100% (2369/2369), done.
Delta compression using up to 8 threads
Compressing objects: 100% (546/546), done.
Writing objects: 100% (1378/1378), 468.06 KiB | 7.67 MiB/s, done.
Total 1378 (delta 1109), reused 1096 (delta 832)
remote: Resolving deltas: 100% (1109/1109), completed with 312 local objects.
To https://server.info/fake.git
* [new branch] topic -> topic

"Enumerating" means:

Git constructs a pack-file that contains the commit you are trying to push, as well as all commits, trees, and blobs (collectively, objects) that the server will need to understand that commit.
It finds a set of commits, trees, and blobs such that every reachable object is either in the set or known to be on the server.

The goal is to find the right "frontier":

https://devblogs.microsoft.com/devops/wp-content/uploads/sites/6/2019/05/sparse-push-commit-walk.png

The uninteresting commits that are direct parents of interesting commits form the frontier
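
To make the frontier idea concrete, here is a minimal Python sketch. It assumes a toy commit graph stored as a dict mapping each commit to its parents; the names `reachable`, `find_frontier`, and the commits A to E are hypothetical and only for illustration. Real Git does not build full reachability sets like this, so treat it as a model of the definition, not of Git's implementation.

```python
from typing import Dict, List, Set


def reachable(parents: Dict[str, List[str]], tips: Set[str]) -> Set[str]:
    """Return every commit reachable from `tips` by following parent links."""
    seen: Set[str] = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def find_frontier(
    parents: Dict[str, List[str]], push_tips: Set[str], remote_tips: Set[str]
) -> Set[str]:
    """Uninteresting commits that are direct parents of interesting commits."""
    # Uninteresting: everything the server is assumed to already have.
    uninteresting = reachable(parents, remote_tips)
    # Interesting: reachable from what we push, but not known to the server.
    interesting = reachable(parents, push_tips) - uninteresting
    return {p for c in interesting for p in parents[c] if p in uninteresting}


# Hypothetical history: A <- B <- C is on the server; topic adds D <- E on top of C.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"]}
frontier = find_frontier(parents, push_tips={"E"}, remote_tips={"C"})
print(frontier)  # {'C'}
```

In this toy history only C sits on the frontier: it is the one commit the server has that is a direct parent of a commit being pushed.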

Old:

To determine which trees and blobs are interesting, the old algorithm first determined all uninteresting trees and blobs.

Starting at every uninteresting commit in the frontier, recursively walk from its root tree and mark all reachable trees and blobs as uninteresting. This walk skips trees that were already marked as uninteresting to avoid revisiting potentially large portions of the graph.

https://devblogs.microsoft.com/devops/wp-content/uploads/sites/6/2019/05/sparse-push-old-algorithm.png
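
The old walk described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Git's actual code: the object store is a hypothetical dict of tree id to entries, any id that is not a key is treated as a blob, and the function names are made up for this example.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical object store: tree id -> {entry name: child object id}.
# Any id that is not a key here is treated as a blob.
Trees = Dict[str, Dict[str, str]]


def mark_uninteresting(trees: Trees, oid: str, marked: Set[str]) -> None:
    """Recursively mark everything under `oid`, skipping already-marked objects."""
    if oid in marked:
        return
    marked.add(oid)
    for child in trees.get(oid, {}).values():
        mark_uninteresting(trees, child, marked)


def collect_interesting(trees: Trees, oid: str, marked: Set[str], out: Set[str]) -> None:
    """Collect objects under `oid` that were not marked uninteresting."""
    if oid in marked or oid in out:
        return
    out.add(oid)
    for child in trees.get(oid, {}).values():
        collect_interesting(trees, child, marked, out)


def old_walk(
    trees: Trees, frontier_roots: Iterable[str], interesting_roots: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    marked: Set[str] = set()
    for root in frontier_roots:
        mark_uninteresting(trees, root, marked)  # touches the whole old tree
    to_send: Set[str] = set()
    for root in interesting_roots:
        collect_interesting(trees, root, marked, to_send)
    return to_send, marked


# root0 is the frontier commit's root tree; root1 changes one file under B.
trees = {
    "root0": {"A": "treeA0", "B": "treeB0"},
    "root1": {"A": "treeA0", "B": "treeB1"},
    "treeA0": {"f1": "blob1", "f2": "blob2"},
    "treeB0": {"g": "blob3"},
    "treeB1": {"g": "blob3x"},
}
to_send, marked = old_walk(trees, ["root0"], ["root1"])
print(sorted(to_send))  # ['blob3x', 'root1', 'treeB1']
print(len(marked))      # 6
```

Only three objects need to be sent, but the marking step visited all six objects of the old tree. In a huge repo that marking cost is what dominates.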

New:

The old algorithm is recursive: it takes a tree and runs the algorithm on all subtrees.

The new algorithm uses the paths to reduce the scope of the tree walk. It is also recursive, but it takes a set of trees.
As we start the algorithm, the set of trees contains the root trees for the uninteresting and the interesting commits.

https://devblogs.microsoft.com/devops/wp-content/uploads/sites/6/2019/05/sparse-push-new-algorithm.png

The new tree walk recursively explores paths containing interesting and uninteresting trees.
Inside the trees at B, we have subtrees with names F and G.
Both sets have interesting and uninteresting paths, so we recurse into each set. This continues into B/F and B/G. The B/F set will not recurse into B/F/M or B/F/N and the B/G set will recurse into B/G/X but not B/G/Y.
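
As a rough illustration of the path-based walk, here is a minimal Python sketch. It assumes the same kind of toy object store (a hypothetical dict of tree id to entries, with any other id treated as a blob), and `sparse_mark` and `collect_interesting` are made-up names; the real logic lives inside Git and differs in detail.

```python
from collections import defaultdict
from typing import Dict, Set

# Hypothetical object store: tree id -> {entry name: child object id}.
# Any id that is not a key here is treated as a blob.
Trees = Dict[str, Dict[str, str]]


def sparse_mark(trees: Trees, tree_set: Set[str], marked: Set[str]) -> None:
    """Walk a set of trees that share one path, marking uninteresting children."""
    flags = {tree: (tree in marked) for tree in tree_set}
    # Recurse only where interesting and uninteresting trees meet at this path.
    if all(flags.values()) or not any(flags.values()):
        return
    by_name: Dict[str, Set[str]] = defaultdict(set)
    for tree, is_uninteresting in flags.items():
        for name, child in trees[tree].items():
            if is_uninteresting:
                marked.add(child)
            if child in trees:
                by_name[name].add(child)  # group subtrees by path name
    for children in by_name.values():
        sparse_mark(trees, children, marked)


def collect_interesting(trees: Trees, oid: str, marked: Set[str], out: Set[str]) -> None:
    """Collect objects under `oid` that were not marked uninteresting."""
    if oid in marked or oid in out:
        return
    out.add(oid)
    for child in trees.get(oid, {}).values():
        collect_interesting(trees, child, marked, out)


# root0 is the frontier commit's root tree; root1 changes one file under B.
trees = {
    "root0": {"A": "treeA0", "B": "treeB0"},
    "root1": {"A": "treeA0", "B": "treeB1"},
    "treeA0": {"f1": "blob1", "f2": "blob2"},
    "treeB0": {"g": "blob3"},
    "treeB1": {"g": "blob3x"},
}
marked = {"root0"}  # the uninteresting root tree is flagged up front
sparse_mark(trees, {"root0", "root1"}, marked)
to_send: Set[str] = set()
collect_interesting(trees, "root1", marked, to_send)
print(sorted(marked))   # ['blob3', 'root0', 'treeA0', 'treeB0']
print(sorted(to_send))  # ['blob3x', 'root1', 'treeB1']
```

The set at path A contains only the unchanged, uninteresting tree, so the walk never descends into it and never touches blob1 or blob2. The objects to send are the same three as with a full walk, but fewer objects are marked along the way.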


This sounds like a line-ending issue.

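If you check out the repo on a Windows machine, the Unix (LF) line endings are converted to Windows (CR LF) line endings.
When you commit, Git then considers every file modified, because all the line endings have changed.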

You can configure Git to manage this for you with the following command:

git config --global core.autocrlf true