Rewrite history with git filter-branch: create / split into submodules / subprojects

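I am currently importing a CVS project into git.
After the import, I want to rewrite the history so that an existing directory is moved into a separate submodule.

Suppose I have a structure like this: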


file1
file2
file3
dir1
dir2
library

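Now I want to rewrite the history so that the directory library has always been a git submodule, i.e. split the specified directories out into their own submodules/subprojects.

This is my current code:

File rewrite-submodule (the script that is called):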

cd project
git filter-branch --tree-filter $PWD/../$0-tree-filter --tag-name-filter cat -- --all

File rewrite-submodule-tree-filter:

    #!/bin/bash

    function gitCommit()
    {
        unset GIT_DIR
        unset GIT_WORK_TREE
        git add -A
        if [ -n "$(git diff --cached --name-only)" ]
        then
            # something to commit
            git commit -F $_msg
        fi
    }

    _git_dir=$GIT_DIR
    _git_work_tree=$GIT_WORK_TREE
    unset GIT_DIR
    unset GIT_WORK_TREE
    _dir=$PWD

    if [ -d "library" ]
    then
        _msg=$(tempfile)
        git log ${GIT_COMMIT}^! --format="%B" > $_msg
        git rm -r --cached lib
        cd library
        if [ -d ".git" ]
        then
            gitCommit
        else
            git init
            gitCommit
        fi
        cd ..
        export GIT_DIR=$_git_dir
        export GIT_WORK_TREE=$_git_work_tree
        git submodule add -f ./lib
    fi

    GIT_DIR=$_git_dir
    GIT_WORK_TREE=$_git_work_tree

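This code creates the .gitmodules file, but it does not create the submodule commit entry in the main repository (the Subproject commit line in the output of git diff), and the files in the directory library are still versioned in the main repository rather than in the subproject repository.

Thanks in advance for any hints.

The .gitmodules file looks like this: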

    [submodule "library"]
        path = library
        url = ./library
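
For anyone unsure what the missing "submodule commit entry" looks like: in a tree, a plain directory is stored as a tree entry with mode 040000, while a submodule is stored as a commit entry (a gitlink) with mode 160000. The following is only a sketch in a throwaway repository; the file name, commit messages and the demo identity are made up, and it builds the gitlink by hand with git update-index rather than with git submodule add.

```shell
#!/bin/bash
# Sketch: how a plain directory and a submodule (gitlink) differ in a tree.
# Everything here (temp repo, file.c, demo identity) is hypothetical.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q .
mkdir library
echo hello > library/file.c
git add -A
git commit -q -m "library as a plain directory"

# A plain directory shows up as a "tree" entry with mode 040000.
plain_entry=$(git ls-tree HEAD library)
echo "$plain_entry"

# Turn the library subtree into a commit of its own ...
sub_commit=$(git commit-tree "HEAD:library" -m "library: initial")

# ... then replace the directory in the index with a gitlink to that commit.
git rm -r -q --cached library
git update-index --add --cacheinfo "160000,$sub_commit,library"
git commit -q -m "library as a submodule"

# A submodule shows up as a "commit" entry with mode 160000.
link_entry=$(git ls-tree HEAD library)
echo "$link_entry"
```

If git ls-tree still reports mode 040000 for library after the rewrite, the directory is still tracked by the main repository, which is the symptom described above.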


I solved my own problem; here is the solution.

Script git-submodule-split:

    #!/bin/bash

    set -eu

    if [ $# -eq 0 ]
    then
        echo "Usage: $0 submodules-to-split"
        exit 1
    fi

    export _tmp=$(mktemp -d)
    export _libs="$@"
    for i in $_libs
    do
        mkdir -p $_tmp/$i
    done

    git filter-branch --commit-filter '
    function gitCommit()
    {
        git add -A
        if [ -n "$(git diff --cached --name-only)" ]
        then
            git commit -F $_msg
        fi
    } >/dev/null

    # from git-filter-branch
    git checkout-index -f -u -a || die "Could not checkout the index"
    # files that $commit removed are now still in the working tree;
    # remove them, else they would be added again
    git clean -d -q -f -x

    _git_dir=$GIT_DIR
    _git_work_tree=$GIT_WORK_TREE
    _git_index_file=$GIT_INDEX_FILE
    unset GIT_DIR
    unset GIT_WORK_TREE
    unset GIT_INDEX_FILE

    _msg=$(tempfile)
    cat /dev/stdin > $_msg
    for i in $_libs
    do
        if [ -d "$i" ]
        then
            unset GIT_DIR
            unset GIT_WORK_TREE
            unset GIT_INDEX_FILE
            cd $i
            if [ -d ".git" ]
            then
                gitCommit
            else
                git init >/dev/null
                gitCommit
            fi
            cd ..
            rsync -a -rtu $i/.git/ $_tmp/$i/.git/
            export GIT_DIR=$_git_dir
            export GIT_WORK_TREE=$_git_work_tree
            export GIT_INDEX_FILE=$_git_index_file
            git rm -q -r --cached $i
            git submodule add ./$i >/dev/null
            git add $i
        fi
    done
    rm $_msg
    export GIT_DIR=$_git_dir
    export GIT_WORK_TREE=$_git_work_tree
    export GIT_INDEX_FILE=$_git_index_file

    if [ -f ".gitmodules" ]
    then
        git add .gitmodules
    fi

    _new_rev=$(git write-tree)
    shift
    git commit-tree "$_new_rev" "$@";
    ' --tag-name-filter cat -- --all

    for i in $_libs
    do
        if [ -d "$_tmp/$i/.git" ]
        then
            rsync -a -i -rtu $_tmp/$i/.git/ $i/.git/
            cd $i
            git reset --hard
            cd ..
        fi
    done
    rm -r $_tmp

    git for-each-ref refs/original --format="%(refname)" | while read i; do git update-ref -d $i; done

    git reflog expire --expire=now --all
    git gc --aggressive --prune=now
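
The commit filter above ends by snapshotting the modified index with git write-tree and then creating the rewritten commit by hand with git commit-tree, which reads the commit message from stdin. Here is a minimal sketch of just that pair of commands outside filter-branch, in a throwaway repository; the file names, messages and demo identity are made up for illustration.

```shell
#!/bin/bash
# Sketch: build a commit by hand from the index, as a commit filter does.
# The temp repo, file names and identity are hypothetical.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q .
echo one > file1
git add file1
git commit -q -m "first"
parent=$(git rev-parse HEAD)

# Stage a change, then write the current index out as a tree object.
echo two > file2
git add file2
new_tree=$(git write-tree)

# Create a commit for that tree; the message comes from stdin and the
# parent is passed with -p, as filter-branch passes them to the filter.
new_commit=$(echo "second, made by hand" | git commit-tree "$new_tree" -p "$parent")

git log -1 --format='%s' "$new_commit"
git ls-tree --name-only "$new_commit"
```

Note that git commit-tree only creates the commit object and prints its id; no branch is moved, which is why filter-branch takes care of updating the refs afterwards.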


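I had a project with a utils library that started to be useful in other projects, and I wanted to split its history off into a submodule. I didn't think to look here first, so I wrote my own. It builds the history locally, so it is a good bit faster; afterwards, if you want, you can set up the helper command's .gitmodules file and so on, and push the submodule histories anywhere you like.

The stripped-down command itself is right here; the documentation is in the comments of the unstripped version that follows. Run it as its own command with subdir set, e.g. subdir=utils git split-submodule if you are splitting the utils directory. It is hacky because it is a one-off, but I tested it on the Documentation subdirectory in the Git history.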

#!/bin/bash
# put this or the commented version below in e.g. ~/bin/git-split-submodule
${GIT_COMMIT-exec git filter-branch --index-filter "subdir=$subdir; ${debug+debug=$debug;} $(sed 1,/SNIP/d "$0")" "$@"}
${debug+set -x}
fam=(`git rev-list --no-walk --parents $GIT_COMMIT`)
pathcheck=(`printf "%s:$subdir\\n" ${fam[@]} \
    | git cat-file --batch-check='%(objectname)' | uniq`)
[[ $pathcheck = *:* ]] || {
    subfam=($( set -- ${fam[@]}; shift;
        for par; do tpar=`map $par`; [[ $tpar != $par ]] &&
            git rev-parse -q --verify $tpar:"$subdir"
        done
    ))
    git rm -rq --cached --ignore-unmatch "$subdir"
    if (( ${#pathcheck[@]} == 1 && ${#fam[@]} > 1 && ${#subfam[@]} > 0)); then
        git update-index --add --cacheinfo 160000,$subfam,"$subdir"
    else
        subnew=`git cat-file -p $GIT_COMMIT | sed 1,/^$/d \
            | git commit-tree $GIT_COMMIT:"$subdir" $(
                ${subfam:+printf ' -p %s' ${subfam[@]}}) 2>&-
            ` &&
        git update-index --add --cacheinfo 160000,$subnew,"$subdir"
    fi
}
${debug+set +x}

#!/bin/bash
# Git filter-branch to split a subdirectory into a submodule history.

# In each commit, the subdirectory tree is replaced in the index with an
# appropriate submodule commit.
# * If the subdirectory tree has changed from any parent, or there are
#   no parents, a new submodule commit is made for the subdirectory (with
#   the current commit's message, which should presumably say something
#   about the change). The new submodule commit's parents are the
#   submodule commits in any rewrites of the current commit's parents.
# * Otherwise, the submodule commit is copied from a parent.

# Since the new history includes references to the new submodule
# history, the new submodule history isn't dangling, it's incorporated.
# Branches for any part of it can be made casually and pushed into any
# other repo as desired, so hooking up the `git submodule` helper
# command's conveniences is easy, e.g.
#     subdir=utils git split-submodule master
#     git branch utils $(git rev-parse master:utils)
#     git clone -sb utils . ../utilsrepo
# and you can then submodule add from there in other repos, but really,
# for small utility libraries and such, just fetching the submodule
# histories into your own repo is easiest. Setup on cloning a
# project using "incorporated" submodules like this is:
#   setup:  utils/.git
#
#   utils/.git:
#       @if _=`git rev-parse -q --verify utils`; then \
#           git config submodule.utils.active true \
#           && git config submodule.utils.url "`pwd -P`" \
#           && git clone -s . utils -nb utils \
#           && git submodule absorbgitdirs utils \
#           && git -C utils checkout $$(git rev-parse :utils); \
#       fi
# with `git config -f .gitmodules submodule.utils.path utils` and
# `git config -f .gitmodules submodule.utils.url ./`; cloners don't
# have to do anything but `make setup`, and `setup` should be a prereq
# on most things anyway.

# You can test that a commit and its rewrite put the same tree in the
# same place with this function:
# testit ()
# {
#     tree=($(git rev-parse `git rev-parse $1`: refs/original/refs/heads/$1));
#     echo $tree `test $tree != ${tree[1]} && echo ${tree[1]}`
# }
# so e.g. `testit make~95^2:t` will print the `t` tree there and if
# the `t` tree at ~95^2 from the original differs it'll print that too.

# To run it, say `subdir=path/to/it git split-submodule` with whatever
# filter-branch args you want.

# $GIT_COMMIT is set if we're already in filter-branch, if not, get there:
${GIT_COMMIT-exec git filter-branch --index-filter "subdir=$subdir; ${debug+debug=$debug;} $(sed 1,/SNIP/d "$0")" "$@"}

${debug+set -x}
fam=(`git rev-list --no-walk --parents $GIT_COMMIT`)
pathcheck=(`printf "%s:$subdir\\n" ${fam[@]} \
    | git cat-file --batch-check='%(objectname)' | uniq`)

[[ $pathcheck = *:* ]] || {
    subfam=($( set -- ${fam[@]}; shift;
        for par; do tpar=`map $par`; [[ $tpar != $par ]] &&
            git rev-parse -q --verify $tpar:"$subdir"
        done
    ))

    git rm -rq --cached --ignore-unmatch "$subdir"
    if (( ${#pathcheck[@]} == 1 && ${#fam[@]} > 1 && ${#subfam[@]} > 0)); then
        # one id same for all entries, copy mapped mom's submod commit
        git update-index --add --cacheinfo 160000,$subfam,"$subdir"
    else
        # no mapped parents or something changed somewhere, make new
        # submod commit for current subdir content.  The new submod
        # commit has all mapped parents' submodule commits as parents:
        subnew=`git cat-file -p $GIT_COMMIT | sed 1,/^$/d \
            | git commit-tree $GIT_COMMIT:"$subdir" $(
                ${subfam:+printf ' -p %s' ${subfam[@]}}) 2>&-
            ` &&
        git update-index --add --cacheinfo 160000,$subnew,"$subdir"
    fi
}
${debug+set +x}
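
The pathcheck step in the script decides whether the subdirectory changed by asking git cat-file --batch-check for the id of rev:subdir in the commit and each of its parents and collapsing duplicates with uniq. Below is a rough standalone sketch of that check in a throwaway repository. The helper name subtree_ids, the utils layout and the demo identity are assumptions made for the example and are not part of the script.

```shell
#!/bin/bash
# Sketch: detect whether a subdirectory differs between a commit and its parents.
# subtree_ids, the temp repo and its contents are hypothetical.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q .
mkdir utils
echo a > utils/a.txt
echo top > README
git add -A
git commit -q -m "add utils"
echo more >> README
git commit -q -a -m "touch only README"
echo b > utils/b.txt
git add -A
git commit -q -m "change utils"

# Print the distinct object ids of <rev>:utils for a commit and its parents.
subtree_ids() {
    local fam
    fam=($(git rev-list --no-walk --parents "$1"))
    printf '%s:utils\n' "${fam[@]}" \
        | git cat-file --batch-check='%(objectname)' | uniq
}

# One distinct id means the subdirectory is identical to the parent's.
unchanged=$(subtree_ids HEAD~1 | wc -l | tr -d ' ')
# More than one id means the commit changed the subdirectory.
changed=$(subtree_ids HEAD | wc -l | tr -d ' ')
echo "distinct utils trees: README-only commit=$unchanged, utils commit=$changed"
```

When a path does not exist in a revision, git cat-file prints the input followed by "missing" instead of an id, which is what the script's test for a colon in the first pathcheck entry relies on.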


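Here is an updated version of the answer that worked for me on Mac OS X. The main change is using pushd/popd to change directories, so that the submodule can be something like module/glop rather than just glop.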

#!/bin/bash

set -eu

if [ $# -eq 0 ]
then
    echo "Usage: $0 submodules-to-split"
    exit 1
fi

export _tmp=$(mktemp -d /tmp/git-submodule-split.XXXXXX)
export _libs="$@"
for i in $_libs
do
    mkdir -p $_tmp/$i
done

git filter-branch --commit-filter '
function gitCommit()
{
    git add -A
    if [ -n "$(git diff --cached --name-only)" ]
    then
        git commit -F $_msg
    fi
} >/dev/null

# from git-filter-branch
git checkout-index -f -u -a || die "Could not checkout the index"
# files that $commit removed are now still in the working tree;
# remove them, else they would be added again
git clean -d -q -f -x >&2

_git_dir=$GIT_DIR
_git_work_tree=$GIT_WORK_TREE
_git_index_file=$GIT_INDEX_FILE
unset GIT_DIR
unset GIT_WORK_TREE
unset GIT_INDEX_FILE

_msg=$(mktemp /tmp/git-submodule-split-msg.XXXXXX)
cat /dev/stdin > $_msg
for i in $_libs
do
    if [ -d "$i" ]
    then
        unset GIT_DIR
        unset GIT_WORK_TREE
        unset GIT_INDEX_FILE
        pushd $i > /dev/null
        if [ -d ".git" ]
        then
            gitCommit
        else
            git init >/dev/null
            gitCommit
        fi
        popd > /dev/null
        mkdir -p $_tmp/$i
        rsync -a -rtu $i/.git/ $_tmp/$i/.git/
        export GIT_DIR=$_git_dir
        export GIT_WORK_TREE=$_git_work_tree
        export GIT_INDEX_FILE=$_git_index_file
        git rm -q -r --cached $i >&2
        git submodule add ./$i $i >&2
        git add $i >&2
    fi
done
export GIT_DIR=$_git_dir
export GIT_WORK_TREE=$_git_work_tree
export GIT_INDEX_FILE=$_git_index_file

if [ -f ".gitmodules" ]
then
    git add .gitmodules >&2
fi

_new_rev=$(git write-tree)
shift
git commit-tree -F $_msg "$_new_rev" $@;
rm -f $_msg
' --tag-name-filter cat -- --all

for i in $_libs
do
    if [ -d "$_tmp/$i/.git" ]
    then
        rsync -a -i -rtu $_tmp/$i/.git/ $i/.git/
        pushd $i
        git reset --hard
        popd
    fi
done
rm -rf $_tmp

git for-each-ref refs/original --format="%(refname)" | while read i; do git update-ref -d $i; done

git reflog expire --expire=now --all
git gc --aggressive --prune=now
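
To see why pushd/popd matters for nested submodule paths: a single cd .. after cd module/glop only climbs one level, whereas popd returns to wherever pushd was called from. A small sketch, using a made-up module/glop directory in a temporary folder:

```shell
#!/bin/bash
# Sketch: "cd dir; cd .." versus pushd/popd for a nested path.
# The module/glop layout is only an example.
set -eu

demo=$(mktemp -d)
cd "$demo"
mkdir -p module/glop
start=$PWD

# cd into a nested path, then a single "cd .." ends up one level short.
cd module/glop
cd ..
after_cd=$PWD
cd "$start"

# pushd/popd returns to the starting directory regardless of depth.
pushd module/glop > /dev/null
popd > /dev/null
after_popd=$PWD

echo "start:        $start"
echo "after cd ..:  $after_cd"
echo "after popd:   $after_popd"
```

With a one-level path such as glop both approaches land in the same place, which is why the earlier script only breaks for paths that contain a slash.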

Note: the submodule entries are only created once you run the following from the parent repo:

git submodule init
git submodule update

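You don't need these commands in your rewrite-submodule-tree-filter script, because that script is only about setting up the content of the .gitmodules file correctly.

You only run these git submodule commands the first time you use the parent repo: see "Cloning a Project with Submodules".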
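
Since the filter only has to get the .gitmodules content right, one option is to write that file with git config -f rather than going through git submodule add. This is only a sketch of that idea, not what the scripts above do; it writes the same library entries shown in the question into a temporary directory and reads them back.

```shell
#!/bin/bash
# Sketch: write and read .gitmodules entries with git config -f.
# Uses the "library" path and "./library" URL from the question.
set -eu

demo=$(mktemp -d)
cd "$demo"

# Create the [submodule "library"] section with its path and url keys.
git config -f .gitmodules submodule.library.path library
git config -f .gitmodules submodule.library.url ./library

cat .gitmodules

# Read the values back, as a script might when checking the result.
path=$(git config -f .gitmodules --get submodule.library.path)
url=$(git config -f .gitmodules --get submodule.library.url)
echo "path=$path url=$url"
```

Keep in mind that .gitmodules alone does not make a submodule: the tree also needs the gitlink entry for the path, and git submodule init / git submodule update are what copy the settings into .git/config and check the submodule out in a fresh clone.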