Git: Definition of “downstream” and “upstream”

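I've started working with Git and have come across the terms "upstream" and "downstream". I've seen them before but never fully understood them. What do these terms mean in the context of SCMs (software configuration management tools) and source code?

In terms of source control, you are "downstream" when you copy (clone, check out, etc.) from a repository. Information flows "downstream" to you.

When you make changes, you usually want to send them back "upstream" so they make it into that repository, and everyone pulling from the same source works with the same changes. This is mostly a social question of how everyone coordinates their work rather than a technical requirement of source control. You want to get your changes into the main project so that you are not tracking divergent lines of development.

Sometimes you'll read about package or release managers (the people, not the tool) talking about submitting changes "upstream". That usually means they had to adjust the original sources to create a package for their system. They don't want to keep making those changes, so if they send them "upstream" to the original source, they won't have to deal with the same issue in the next release.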


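When you read in the git tag man page:

One important aspect of git is it is distributed, and being distributed largely means there is no inherent "upstream" or "downstream" in the system.

That simply means there is no absolute upstream repo or downstream repo. These notions are always relative between two repos and depend on the way data flows: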

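If "yourRepo" has declared "otherRepo" as a remote, then:

you are pulling from upstream "otherRepo" ("otherRepo" is "upstream from you", and you are "downstream for otherRepo").
you are pushing to upstream ("otherRepo" is still "upstream", where the information now goes back to).

Note the "from" and "for": you are not just "downstream", you are "downstream from/for", hence the relative aspect.

The DVCS (Distributed Version Control System) twist is: you have no idea what downstream actually is, beyond your own repo relative to the remote repos you have declared.

you know what upstream is (the repos you are pulling from or pushing to)
you don't know what downstream is made of (the other repos pulling from or pushing to your repo).

Basically:

In terms of "flow of data", your repo is at the bottom ("downstream") of a flow coming from upstream repos ("pull from") and going back to (the same or other) upstream repos ("push to").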
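To make the relative nature of these terms concrete, here is a minimal sketch using throwaway local repositories. The directory names (seed, otherRepo.git, yourRepo) and the demo identity are assumptions chosen for illustration; nothing in Git requires them.

```shell
#!/bin/sh
# Sketch: "upstream" is whichever repository your clone declares as a remote.
set -e

# Throwaway identity (assumed values) so commits work without global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Build "otherRepo": a bare repository holding one commit on master.
git -c init.defaultBranch=master init -q seed
git -C seed commit -q --allow-empty -m "first commit"
git clone -q --bare seed otherRepo.git

# Cloning declares otherRepo as the remote "origin",
# which makes yourRepo downstream of it.
git clone -q otherRepo.git yourRepo
cd yourRepo
git remote -v

# Send a local change back upstream.
git commit -q --allow-empty -m "local change"
git push -q origin master

local_tip=$(git rev-parse HEAD)
upstream_tip=$(git --git-dir="$work/otherRepo.git" rev-parse master)
echo "local:    $local_tip"
echo "upstream: $upstream_tip"
```

Nothing inside otherRepo.git records that yourRepo exists, which is the point made above: a repo knows its upstreams but not its downstreams.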

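You can see an illustration in the git-rebase man page, in the paragraph "RECOVERING FROM UPSTREAM REBASE":

It means you are pulling from an "upstream" repo where a rebase took place, and you (the "downstream" repo) are stuck with the consequences (lots of duplicate commits, because the branch rebased upstream recreated the commits of the same branch you have locally).

That is bad because for one "upstream" repo there can be many downstream repos (i.e. repos pulling from the upstream one, with the rebased branch), all of them potentially having to deal with the duplicate commits.

Again, with the "flow of data" analogy, in a DVCS one bad command "upstream" can have a "ripple effect" downstream.

Note: this is not limited to data. It also applies to parameters, since git commands (such as the "porcelain" ones) often call other git commands internally (the "plumbing" ones). See the rev-parse man page: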

Many git porcelainish commands take mixture of flags (i.e. parameters that begin with a dash '-') and parameters meant for the underlying git rev-list command they use internally and flags and parameters for the other commands they use downstream of git rev-list. This command is used to distinguish between them.


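Upstream (as related to) Tracking

The term upstream also has an unambiguous meaning within the suite of Git tools, especially in relation to tracking.

For example: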

$ git rev-list --count --left-right "@{upstream}"...HEAD
> 4 12

will print (the last cached value of) the number of commits behind (left) and ahead (right) of your current working branch, relative to the (if any) currently tracking remote branch for this local branch. It will print an error message otherwise:

> error: No upstream branch found for ''
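The behind/ahead counts can be reproduced in a sandbox. This is only a sketch: the repository names (seed, upstream.git, mine, theirs) and the demo identity are assumptions made up for the example.

```shell
#!/bin/sh
# Sketch: count commits behind/ahead of the tracked upstream branch.
set -e

# Throwaway identity (assumed values) so commits work without global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# One shared bare repository and two clones of it.
git -c init.defaultBranch=master init -q seed
git -C seed commit -q --allow-empty -m "base"
git clone -q --bare seed upstream.git
git clone -q upstream.git mine
git clone -q upstream.git theirs

# Someone else publishes one commit.
git -C theirs commit -q --allow-empty -m "their work"
git -C theirs push -q origin master

# You make two local commits, then refresh the remote-tracking refs.
cd mine
git commit -q --allow-empty -m "my work 1"
git commit -q --allow-empty -m "my work 2"
git fetch -q

# Left count = commits only on the upstream side (behind),
# right count = commits only on HEAD (ahead); here 1 and 2.
counts=$(git rev-list --count --left-right "@{upstream}...HEAD")
echo "$counts"
```

Without the git fetch, the left count would still be 0, because the comparison uses the locally cached remote-tracking branch rather than contacting the remote.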

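As has already been said, you may have any number of remotes for one local repository. For example, if you fork a repo on GitHub and then issue a "pull request", you almost certainly have at least two: origin (your forked repo on GitHub) and upstream (the repo on GitHub you forked from). Those are just interchangeable names; only the "git@…" URL identifies them.

Your .git/config reads: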

[remote "origin"]
    fetch = +refs/heads/*:refs/remotes/origin/*
    url = git@github.com:myusername/reponame.git
[remote "upstream"]
    fetch = +refs/heads/*:refs/remotes/upstream/*
    url = git@github.com:authorname/reponame.git

On the other hand, the meaning of @{upstream} for Git is unique:

it is 'the branch' (if any) on 'said remote', which is tracking the 'current branch' on your 'local repository'.

It's the branch you fetch/pull from whenever you issue a plain git fetch/git pull, without arguments.

Let's say you want to set the remote branch origin/master as the tracking branch for the local master branch you have checked out. Just issue:

$ git branch --set-upstream-to=origin/master master
> Branch master set up to track remote branch master from origin.

This adds 2 parameters in .git/config:

[branch "master"]
    remote = origin
    merge = refs/heads/master

Now try (provided the 'upstream' remote has a 'dev' branch):

$ git branch --set-upstream-to=upstream/dev master
> Branch master set up to track remote branch dev from upstream.

.git/config now reads:

[branch "master"]
    remote = upstream
    merge = refs/heads/dev
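Here is a sandboxed sketch of the same re-pointing, using the --set-upstream-to spelling that current Git versions accept, and reading the two config keys back. The repository names (seed, author.git, fork.git, mine) and the demo identity are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: point local master at upstream/dev and inspect the resulting config.
set -e

# Throwaway identity (assumed values) so commits work without global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A seed repository with master and dev, copied into two bare repositories
# standing in for the original project and your fork of it.
git -c init.defaultBranch=master init -q seed
git -C seed commit -q --allow-empty -m "base"
git -C seed branch dev
git clone -q --bare seed author.git
git clone -q --bare seed fork.git

# Clone the fork (remote "origin"), then add the original as "upstream".
git clone -q fork.git mine
cd mine
git remote add upstream "$work/author.git"
git fetch -q upstream

# Re-point the upstream of local master.
git branch --set-upstream-to=upstream/dev master

remote=$(git config branch.master.remote)
merge=$(git config branch.master.merge)
tracked=$(git rev-parse --abbrev-ref "master@{upstream}")
echo "remote=$remote merge=$merge tracked=$tracked"
```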

git-push(1) Manual Page :

-u
--set-upstream

For every branch that is up to date or successfully pushed, add upstream (tracking) reference, used by argument-less git-pull(1) and other commands. For more information, see branch.<name>.merge in git-config(1).

git-config(1) Manual Page :

branch.<name>.merge

Defines, together with branch.<name>.remote, the upstream branch for the given branch. It tells git fetch/git pull/git rebase which branch to merge and can also affect git push (see push.default).
(...)

branch.<name>.remote

When in branch <name>, it tells git fetch and git push which remote to fetch from/push to. It defaults to origin if no remote is configured. origin is also used if you are not on any branch.

Upstream and Push (Gotcha)

Take a look at the git-config(1) manual page:

git config --global push.default upstream
git config --global push.default tracking  (deprecated)

This is to prevent accidental pushes to branches which you’re not ready to push yet.
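A small sketch of what the upstream setting does to a bare git push. The setting is applied per repository here rather than with --global, so the sandbox leaves your real configuration alone; the names (seed, origin.git, mine, feature) and the demo identity are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: with push.default=upstream, "git push" updates only the
# upstream branch of the branch you are currently on.
set -e

# Throwaway identity (assumed values) so commits work without global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

git -c init.defaultBranch=master init -q seed
git -C seed commit -q --allow-empty -m "base"
git clone -q --bare seed origin.git
git clone -q origin.git mine
cd mine

# A local branch named "feature" whose upstream is origin/master.
git checkout -q --track -b feature origin/master
git config push.default upstream
git commit -q --allow-empty -m "work on feature"

# No arguments: the current branch is pushed to its upstream, origin's master.
git push -q

pushed_tip=$(git --git-dir="$work/origin.git" rev-parse master)
remote_feature=$(git --git-dir="$work/origin.git" branch --list feature)
echo "origin master is now at $pushed_tip"
```

Note that no feature branch appears on the remote: the local branch name is irrelevant in this mode, only its configured upstream matters.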


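It's a bit of informal terminology.

As far as Git is concerned, every other repository is just a remote.

Generally speaking, upstream is where you cloned from (the origin). Downstream is any project that integrates your work with other work.

The terms are not restricted to Git repositories.

For instance, Ubuntu is a Debian derivative, so Debian is upstream for Ubuntu.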

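Upstream Called Harmful

There is, alas, another use of "upstream" that the other answers here do not address, namely to refer to the parent-child relationship of commits within a repo. Scott Chacon in the Pro Git book is particularly prone to this, and the results are unfortunate. Do not imitate this way of speaking.

For example, he says of a merge resulting in a fast-forward that this happens because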

the commit pointed to by the branch you merged in was directly
upstream of the commit you’re on

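What he means is that commit B is the only child of the only child of ... of commit A, so to merge B into A it is sufficient to move the ref A to point to commit B. Why this direction should be called "upstream" rather than "downstream", or why the geometry of such a pure straight-line graph should be described as "directly upstream", is completely unclear and probably arbitrary. (The man page for git-merge does a far better job of explaining this relationship when it says that "the current branch head is an ancestor of the named commit." That is the sort of thing Chacon should have said.)

Indeed, Chacon himself appears to use "downstream" later to mean exactly the same thing, when he speaks of rewriting all child commits of a deleted commit: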

You must rewrite all the commits downstream from 6df76 to fully remove
this file from your Git history

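Basically, he seems not to have any clear idea of what he means by "upstream" and "downstream" when referring to the history of commits over time. This use is informal, then, and not to be encouraged, as it is just confusing.

It is perfectly clear that every commit (except one) has at least one parent, and that parents of parents are thus ancestors; in the other direction, commits have children and descendants. That is accepted terminology, and it describes the directionality of the graph unambiguously, so that is the way to talk when you want to describe how commits relate to one another within the graph geometry of a repo. Do not use "upstream" or "downstream" loosely in this situation.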
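Git lets you ask the question in exactly these graph terms. The following is a sketch in a throwaway repository; the branch name topic, the commit messages and the demo identity are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: a fast-forward is possible when the current tip is an
# ancestor of the commit being merged.
set -e

# Throwaway identity (assumed values) so commits work without global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

git -c init.defaultBranch=master init -q repo
cd repo
git commit -q --allow-empty -m "commit A"
git checkout -q -b topic
git commit -q --allow-empty -m "commit B"
git checkout -q master

# Is master's tip (A) an ancestor of topic's tip (B)?
if git merge-base --is-ancestor master topic; then
    relation="ancestor"
else
    relation="not an ancestor"
fi
echo "master is $relation of topic"

# Since it is, the merge only has to move master forward to B.
git merge -q --ff-only topic
```

There is no remote anywhere in this example, only two local branches, so "ancestor" and "descendant" are the only terms that apply.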

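[Additional note: I've been thinking about the relationship between the first Chacon sentence I cite above and the git-merge man page, and it occurs to me that the former may be based on a misunderstanding of the latter. The man page does go on to describe a situation where the use of "upstream" is legitimate: fast-forwarding often happens when "you are tracking an upstream repository, you have committed no local changes, and now you want to update to a newer upstream revision." So perhaps Chacon used "upstream" because he saw it in the man page. But in the man page there is a remote repository; there is no remote repository in Chacon's cited example of fast-forwarding, just a couple of locally created branches.]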
