Shallow git fetch regression in discourse_docker

Hello,

A recent commit in discourse_docker breaks the ability to specify a tag (for instance v2.6.0) as the version: value in app.yml. This is useful for installing older versions of Discourse for testing purposes.

With version: v2.6.0 and commit e2eb085714dfcf2aa0117b0f2fdf39b762b0e18d, the problem shows up like this:

I, [2020-12-05T10:59:38.848743 #1]  INFO -- : > cd /var/www/discourse && git remote set-branches origin v2.6.0
I, [2020-12-05T10:59:38.852600 #1]  INFO -- : 
I, [2020-12-05T10:59:38.852639 #1]  INFO -- : > cd /var/www/discourse && git fetch --depth 1 origin v2.6.0
From https://github.com/discourse/discourse
 * tag                 v2.6.0     -> FETCH_HEAD
I, [2020-12-05T10:59:41.405163 #1]  INFO -- : 
I, [2020-12-05T10:59:41.405307 #1]  INFO -- : > cd /var/www/discourse && git checkout v2.6.0
error: pathspec 'v2.6.0' did not match any file(s) known to git
I, [2020-12-05T10:59:41.411796 #1]  INFO -- : 

The expected output, which is what the commit just before that one produces, is:

I, [2020-12-05T11:22:14.717910 #1]  INFO -- : > cd /var/www/discourse && git fetch origin v2.6.0
From https://github.com/discourse/discourse
 * tag                     v2.6.0     -> FETCH_HEAD
I, [2020-12-05T11:22:15.672616 #1]  INFO -- : 
I, [2020-12-05T11:22:15.672683 #1]  INFO -- : > cd /var/www/discourse && git checkout v2.6.0
Note: checking out 'v2.6.0'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at d6121249d3 Version bump to v2.6.0

I would not call this a bug per se, but we do want to offer some instructions on how to install ancient versions of Discourse, even if that involves a more complicated template or something similar.

@Falco is investigating.


I agree the depth should be configurable, perhaps via a simple environment variable that defaults to “shallow”.
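
To make the suggestion concrete, here is a rough sketch of what such a switch could look like. GIT_FETCH_DEPTH and fetch_depth_args are names invented for illustration; nothing like them exists in discourse_docker as far as this thread shows.

```shell
# Hypothetical helper: GIT_FETCH_DEPTH is an invented name, not an existing
# discourse_docker setting. Unset or empty means shallow (depth 1).
fetch_depth_args() {
  depth="${GIT_FETCH_DEPTH:-1}"
  case "$depth" in
    *[!0-9]*|0*)
      # Reject anything that is not a positive integer.
      echo "invalid GIT_FETCH_DEPTH: $depth" >&2
      return 1
      ;;
    *)
      printf '%s\n' "--depth=$depth"
      ;;
  esac
}

# Print the command a template could run instead of hard-coding --depth 1.
echo "git fetch $(fetch_depth_args) origin tests-passed"
```

Keeping the default at 1 would preserve the smaller image, while anyone who needs more history could raise the value before a rebuild.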


I think the problem here is that the tags are not known locally when doing the checkout, like in the following (non discourse related) issue:

Running git fetch --all should solve the problem, but I don’t know how much it could increase the image size (unless another instruction clears the unused references later on).

That said, I think git clone --depth 1 https://github.com/discourse/discourse.git --branch=$version would solve it, because --branch accepts both branches and tags. I haven’t tested it, though, and I don’t know whether there is a reason the clone currently uses the master branch.

After running git clone --depth 1 https://github.com/discourse/discourse.git --branch=v2.6.0, the entire folder is 212MB, of which the .git folder accounts for 46MB, so I think it’s fine.
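
The claim that --branch also accepts a tag can be checked without touching the network. The sketch below builds a throwaway repository (the paths, commit messages and the v1.0 tag are all made up) and shallow-clones it at the tag:

```shell
set -eu
work=$(mktemp -d)
# Run git with a throwaway identity so commits work on any machine.
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

# "Upstream" repository whose tagged commit is no longer the branch tip.
g init -q "$work/origin"
cd "$work/origin"
g commit -q --allow-empty -m "release"
g tag v1.0
g commit -q --allow-empty -m "later work"

# --branch is given a tag name here; the clone ends up on a detached HEAD.
g clone -q --depth 1 --branch v1.0 "file://$work/origin" "$work/clone"
cd "$work/clone"
subject=$(git log -1 --format=%s)
commits=$(git rev-list --count HEAD)
echo "checked out: $subject (commits in local history: $commits)"

cd /
rm -rf "$work"
```

The file:// URL matters in this test, because git ignores --depth for plain local paths.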

That doubles the repository size :slightly_frowning_face:

The problem is that at image build time I don’t know which branch you will want in the future.

The setup was changed to make the image smaller, and it reduced the compressed image size by 250MB (25%), which is a huge win. It works fine with normal branches such as stable, beta or tests-passed.

As a work-around, if you want to switch to a tag you can apply this to your app.yml file:

hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - git clone https://github.com/discourse/docker_manager.git
+    - exec:
+        cd: $home
+        cmd:
+          - git fetch --depth=1 origin tag v2.5.0 --no-tags
+          - git checkout v2.5.0
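
For anyone wondering why the extra fetch is needed, the difference between the two fetch forms can be reproduced on a throwaway repository. This is only a sketch: the paths, commit messages and the v1.0 tag are invented, and it assumes nothing beyond a local git installation.

```shell
set -eu
work=$(mktemp -d)
# Run git with a throwaway identity so commits work on any machine.
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

# "Upstream" repository whose tagged commit is no longer the branch tip.
g init -q "$work/origin"
cd "$work/origin"
g commit -q --allow-empty -m "release"
g tag v1.0
g commit -q --allow-empty -m "later work"

# Shallow clone, standing in for the smaller base image: v1.0 is not in it.
g clone -q --depth 1 "file://$work/origin" "$work/clone"
cd "$work/clone"

# 1. Fetching the tag by bare name only records the commit in FETCH_HEAD.
g fetch -q --depth 1 origin v1.0
if g checkout -q v1.0 2>/dev/null; then bare_fetch=ok; else bare_fetch=failed; fi

# 2. The "tag <name>" form also creates refs/tags/v1.0 locally.
g fetch -q --depth=1 --no-tags origin tag v1.0
if g checkout -q v1.0 2>/dev/null; then tag_fetch=ok; else tag_fetch=failed; fi

echo "bare fetch: $bare_fetch, tag fetch: $tag_fetch"
cd /
rm -rf "$work"
```

After the first fetch, git checkout FETCH_HEAD would also reach the same commit, but it leaves no local tag name to refer to later.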

Another work-around is adding a base_image key to the top level of app.yml, with an older base image as its value. Since we don’t even try to keep new images compatible with older Discourse versions, this may be necessary if you are going back far enough.


You’re right, at that time we don’t know the version. It seems the base image uses the current version of the tests-passed branch, although the branch will be at whatever commit it had when the image was built.

Wouldn’t the current approach make rebuilds slower, even when the tests-passed branch is used?

Just consider the following instructions:

In the base image:

git clone --depth 1 https://github.com/discourse/discourse.git
cd discourse/
git remote set-branches --add origin tests-passed

In web.template.yml:

git reset --hard
git clean -f
git remote set-branches --add origin master
git pull
...

When git pull is called, the entire repository is pulled, which can take several minutes, because only a shallow clone was done before. You can try running just the instructions above locally and see for yourself.

I am not saying that having the entire repository in the base image is better, but the code in web.template.yml runs on every rebuild, even if only a plugin was added or a setting was changed in app.yml. What I normally do in my (non-Discourse) projects is build a new image for every new version, but that may not be feasible for you, considering how you currently do it.

Haven’t you noticed an increase in rebuild time? Or maybe it is not that large compared to the total rebuild time in most cases.

Update

I tested the above steps again and they were fast. I guess I ran some other instruction on the first try that changed the git tree, so git pull ended up trying to pull everything.


Are you sure about that?

➜  discoursesmall git:(6a42acbf) docker run --rm -it discourse/base:2.0.20201125-2246
root@b481d11669ba:/# cd /var/www/discourse/
root@b481d11669ba:/var/www/discourse# du -sh .                                                     
774M    . 
root@b481d11669ba:/var/www/discourse# git pull
...
root@b481d11669ba:/var/www/discourse# du -sh .                                                                
778M    . 

@lucasbasquerotto you are correct to point out that the git pull is not strictly necessary, though; we’ve removed it here

This should allow other branches (or forks) to play a little nicer with discourse_docker moving forward :slight_smile:


Yes, I see that it does a fetch and then a checkout of the correct branch after the pull, so I think the git pull is not necessary (I haven’t tested it, though).

For tags it seems that I still need to fetch the tag separately, but that looks doable. Branches are also more commonly used, so tags are more of an edge case.


Is it safe to assume that the version option in the configuration file standalone.yml has no effect?

It still has an effect, but it can only be set to branches now.


I am getting the same error. I was using version 2.5.1.

After that, I’m getting the following error:

I, [2020-12-31T11:50:24.701475 #1]  INFO -- : > cd /var/www/discourse && find /var/www/discourse ! -user discourse -exec chown discourse {} \+
chown: cannot dereference '/var/www/discourse/public/plugins/styleguide': No such file or directory

The rebuild won’t work. Any help?

Try adding the base_image key mentioned above and using an older image from Docker Hub.


This does not work, because it runs after the step that fails when it cannot retrieve the version. What worked was to create a version.template.yml next to web.template.yml with the following content:

params:
  home: /var/www/discourse

run:
  - exec:
      cd: $home
      hook: code
      cmd:
        - git fetch --tags

And then include this file in containers/app.yml, before web.template.yml like so:

templates:
  - "templates/postgres.template.yml"
  - "templates/redis.template.yml"
  - "templates/version.template.yml"
  - "templates/web.template.yml"

For that to work you shouldn’t use the top-level version key in your app.yml, just that new code. With that in place, the rebuild doesn’t fail.


Thanks for clarifying: that’s exactly the part I was missing. For those who are confused in the same way I was, a release tag of Discourse can be installed by:

  • Making sure the version parameter is not set in app.yml, for instance:
    params:
      db_default_text_search_config: "pg_catalog.english"
      #  version: stable
    
  • Adding code to check out the desired version towards the end of app.yml, for instance:
    hooks:
      after_code:
        - exec:
            cd: $home/plugins
            cmd:
              - git clone https://github.com/discourse/docker_manager.git
    +    - exec:
    +        cd: $home
    +        cmd:
    +          - git fetch --depth=1 origin tag v2.5.0 --no-tags
    +          - git checkout v2.5.0
    

When running ./launcher rebuild app here is what happens:

  • The default version (i.e. the tests-passed branch) is checked out.
  • The v2.5.0 tag is fetched and checked out, effectively replacing the previous version.