CodeBuild, GitLab and Submodules

We like the serverless components of Amazon's AWS. One of its interesting features is a build environment consisting of CodeCommit, CodeBuild and CodePipeline: CodeCommit allows you to create Git repositories, CodeBuild executes the build process and CodePipeline is a continuous delivery service. We wanted to use them to create a custom build server and had the following requirements:

  • Instead of CodeCommit, we have a private GitLab server
  • Automatically execute CodeBuild after every push into our GitLab to provide recent binaries for our customers
  • Support our Git structure which makes use of submodules

Unfortunately, CodeBuild has no native support for private GitLab repositories. Furthermore, its automated checkout ignores submodules. Therefore, we had to put a little more effort into our AWS build server. The details are covered in this blog article.

Connecting GitLab to CodeBuild

If you create a new CodeBuild project, a source and a destination have to be selected. When a build process is started, CodeBuild reads the source, runs a build script and deploys the compiled artifacts to the destination. You can only choose from a fixed list of source providers such as CodeCommit, Amazon S3, GitHub and Bitbucket.

Unfortunately, GitLab is not among them. My first idea was to select an S3 bucket. In that case, a server-side Git hook would be necessary to copy the source code into the S3 bucket after each commit. This leads to two problems:

  • Custom Git hooks must be configured on the filesystem of the GitLab server and I do not have shell access 😉
  • A synchronisation mechanism between GitLab and CodeBuild becomes necessary. Otherwise, two simultaneous commits would overwrite the S3 content while it is still being compiled by CodeBuild

Therefore, I came up with another solution: I created a CodeCommit repository and selected it as the source provider. This repository contains a single buildspec.yml file:

version: 0.2

phases:
  install:
    commands:
        - export GIT_ALLOW_PROTOCOL="ssh"
        - git clone git@mygitlab.phobosys.de:Customer/myproject.git
        - cd myproject
  build:
    commands:
        - echo Build started on `date`
        - mvn install

Whenever CodeBuild is triggered, it executes the buildspec.yml, which clones the project from my GitLab repository and starts Maven. This approach allows us to connect a private GitLab to CodeBuild with minimal overhead. Nevertheless, some problems remain unsolved and are discussed in the following sections.
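
For completeness, this wiring can also be done without the console. The following is a minimal sketch using boto3; the CodeCommit repository URL, the role ARN, the region and the build image are placeholders and not the exact values used in this article. It only illustrates that the repository containing the buildspec.yml is attached as the source provider:

import boto3

# minimal sketch: repository URL, role ARN and image are placeholders
codebuild = boto3.client('codebuild')

codebuild.create_project(
    name='myproject',
    # the CodeCommit repository that only contains the buildspec.yml
    source={
        'type': 'CODECOMMIT',
        'location': 'https://git-codecommit.eu-central-1.amazonaws.com/v1/repos/myproject-buildspec'
    },
    # artifact upload (e.g. to an S3 bucket) is omitted in this sketch
    artifacts={'type': 'NO_ARTIFACTS'},
    environment={
        'type': 'LINUX_CONTAINER',
        'image': 'aws/codebuild/standard:5.0',
        'computeType': 'BUILD_GENERAL1_SMALL'
    },
    serviceRole='arn:aws:iam::123456789012:role/my-codebuild-role'
)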

Adding Credentials and Git Submodules

Of course, the example shown above will not work in many cases because people (including us) deny public access to their repositories. Therefore, we have to provide some kind of login credentials. In our case, we decided to:

  • Add a new AWS GitLab user and generate SSH keys for it
  • Give the previously created user access to the corresponding GitLab projects
  • Store its private SSH key in an S3 bucket
  • Fetch the SSH key during the build and add it to an ssh-agent to allow git clone

We will not go into details regarding GitLab user management or the generation of SSH keys. The new SSH key is stored in an S3 bucket and copied into the CodeBuild container at build time. To achieve this, the buildspec.yml is modified as follows:

version: 0.2

phases:
  install:
    commands:
        # remember the directory CodeBuild starts in to avoid hardcoded paths
        - export OLDDIR=$PWD
        - export GIT_ALLOW_PROTOCOL="ssh"
        # prepare ~/.ssh and fetch the GitLab deploy key from S3
        - cd
        - if [ ! -d ".ssh" ]; then mkdir .ssh; fi
        - cd .ssh
        - aws s3 cp s3://credentials-bucket/customer/id_rsa .
        - chmod 400 id_rsa
        - aws s3 cp s3://credentials-bucket/customer/id_rsa.pub .
        - aws s3 cp s3://credentials-bucket/customer/known_hosts .
        # start the ssh-agent and register the key
        - eval `ssh-agent`
        - ssh-add id_rsa
        # clone the project including its submodules
        - cd $OLDDIR
        - git clone git@mygitlab.phobosys.de:Customer/myproject.git
        - cd myproject
        - git submodule update --init
  # the build phase is unchanged compared to the first version
  build:
    commands:
        - echo Build started on `date`
        - mvn install

As you can see, the install phase was heavily modified. First, we store the current directory in a variable using $PWD. This is necessary because the layout of the AWS container may change and we want to avoid hardcoded paths. Furthermore, we explicitly allow Git to make use of SSH. This may be enabled by default but, again, we do not control the AWS Docker container. Better safe than sorry. 😉

Afterwards, we set up SSH. Besides copying the private and public keys into the .ssh directory, we also add a known_hosts file. It contains the host key of our private GitLab. You have to make sure it is available (it can be generated with ssh-keyscan, for example) because otherwise git clone will fail. Finally, the ssh-agent is started and the key is added.

As mentioned above, our project uses submodules. Therefore, we clone our Git repository and check out the corresponding submodules with git submodule update --init. Do not forget to give your GitLab user access to the submodule projects as well.
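
One detail that is easy to forget: the aws s3 cp commands above only succeed if the CodeBuild service role is allowed to read the credentials bucket. Below is a minimal sketch of the required permission; the role and policy names are placeholders, while the bucket path matches the one used in the buildspec.yml:

import json

import boto3

iam = boto3.client('iam')

# allow the CodeBuild service role to read the SSH key material;
# role name and policy name are placeholders
policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Effect': 'Allow',
        'Action': ['s3:GetObject'],
        'Resource': 'arn:aws:s3:::credentials-bucket/customer/*'
    }]
}

iam.put_role_policy(
    RoleName='my-codebuild-role',
    PolicyName='read-gitlab-ssh-key',
    PolicyDocument=json.dumps(policy)
)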

Automated Execution upon Git Push

We have successfully connected our private GitLab to CodeBuild and can make use of submodules as well. Unfortunately, CodeBuild still has to be triggered manually after each git push. Luckily, GitLab provides webhooks which are automatically executed after predefined Git actions. Therefore, this section presents an approach which uses a webhook to call an AWS Lambda function after every git push and thereby start the CodeBuild process.

The GitLab webhook configuration is part of the project settings, hidden under Settings/Integrations. You have to specify a trigger and a URL. The URL points to a REST resource which is invoked via HTTP POST after every push event. The body of the HTTP request contains information about the repository, the developer, etc. I wrote an AWS Lambda function in Python which reads the POST request and starts CodeBuild.
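
If you prefer to script this step, the same webhook can be created via the GitLab API. Below is a minimal sketch using the third-party python-gitlab package; the access token, the project path and the API Gateway URL of the Lambda function are placeholders:

import gitlab  # third-party package: python-gitlab

# placeholders: GitLab URL, access token, project path and webhook target
gl = gitlab.Gitlab('https://mygitlab.phobosys.de', private_token='<access-token>')
project = gl.projects.get('Customer/myproject')

# register a webhook that fires on every push event
project.hooks.create({
    'url': 'https://<api-id>.execute-api.eu-central-1.amazonaws.com/prod/build',
    'push_events': True
})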

The source code of the Lambda function is available here. Let’s dive into it a little bit:

import json

import boto3

def lambda_handler(event, context):
  req_context = event["requestContext"]

  # only react to HTTP POST requests sent by the GitLab webhook
  if req_context["httpMethod"] == "POST":
    print("Event: {}".format(event['body']))
    body = json.loads(event['body'])

    # the push payload contains the name of the GitLab project
    if body['project']['name'] is not None:
      print('Starting a new build ...')
      target = body['project']['name']
      cb = boto3.client('codebuild')
      build = {
        'projectName': target
      }
      print('Starting build for project {0}'.format(build['projectName']))
      try:
        build = cb.start_build(**build)
        print('Successfully launched a new CodeBuild project build!')
        print('CodeBuild returned: {}'.format(build))
      except Exception as e:
        print('CodeBuild Error: {}'.format(e))

  # always answer with HTTP 200 so GitLab does not mark the webhook as failed
  return {'statusCode': 200, 'body': 'OK'}
The example first checks whether the Lambda function was invoked via HTTP POST. Afterwards, the JSON body is parsed into a dictionary, the project name is extracted and passed to the boto3 CodeBuild client. Finally, the build is started, the result is written to the CloudWatch logs and an HTTP 200 response is returned so that GitLab considers the webhook delivery successful.
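
To get a feeling for the data involved, the handler can be exercised with a stripped-down test event. The sketch below imitates the proxy-integration event that API Gateway hands to the Lambda function; the body contains only the fields the handler actually reads, not a complete GitLab push payload:

import json

# minimal fake API Gateway proxy event carrying a GitLab-style push body;
# running this really calls start_build if AWS credentials are configured
fake_event = {
    'requestContext': {'httpMethod': 'POST'},
    'body': json.dumps({
        'object_kind': 'push',
        'project': {'name': 'myproject'}
    })
}

lambda_handler(fake_event, None)

Note that the handler uses the GitLab project name directly as the CodeBuild project name, so both have to match.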

Conclusion

This article shows you how to integrate a private GitLab into your CodeBuild process. An AWS Lambda function can be used to trigger CodeBuild after every git push. This allows you to provide current precompiled binaries for your customers in an S3 bucket.
