Allison is coding...

Syncing Specific Folders Between Different Repositories Using GitHub Actions

In this article, we’ll explore how to synchronize specific folders between different GitHub repositories using GitHub Actions. This is particularly useful for scenarios where you need to keep a subset of your files in sync across multiple projects.

Problem Statement

We have two repositories:

  1. Repository A: github.com/username/repo-a
    • Contains a folder Folder-A1
  2. Repository B: github.com/username/repo-b
    • Contains a folder Folder-B1

The goal is to ensure that changes in Folder-A1 in Repository A are automatically synced to Folder-B1 in Repository B, and vice versa.

Prerequisites

  1. Personal Access Token (PAT): You’ll need a PAT with repository permissions to enable GitHub Actions to push changes to the repositories.
  2. GitHub Actions: GitHub’s CI/CD service that allows you to automate tasks directly in your repositories.

Steps to Set Up Syncing

1. Create a Personal Access Token

  1. Go to GitHub’s Personal Access Token settings (Settings > Developer settings > Personal access tokens).
  2. Generate a new classic token with the repo scope.

2. Add the PAT to Repository Secrets

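  1. Navigate to your repository on GitHub.
  2. Go to Settings > Secrets and variables > Actions.
  3. Add a new repository secret named PAT with the value of your generated token.
  4. Repeat these steps in the other repository, since both workflows read secrets.PAT.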
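
If you prefer the command line, the same secret can be added to both repositories in one go. The sketch below assumes the GitHub CLI (gh) is installed and already authenticated; the function name add_pat_secret is hypothetical, and username/repo-a and username/repo-b are the placeholder repositories from this article.

```shell
# Hypothetical helper: store the token as a secret named PAT in both repositories.
# Assumes the GitHub CLI (gh) is installed and logged in with access to both repos.
add_pat_secret() {
  token="$1"
  for repo in username/repo-a username/repo-b; do
    # gh reads the secret value from stdin when no --body flag is given,
    # which keeps the token out of the process list
    printf '%s' "$token" | gh secret set PAT --repo "$repo"
  done
}

# Usage (the variable name MY_TOKEN is an assumption):
#   add_pat_secret "$MY_TOKEN"
```

Piping the value through stdin rather than passing it as an argument is a small precaution, not a requirement.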

3. Configure GitHub Actions Workflows

We will create two workflows, one for each repository, to handle the synchronization.

Workflow for Repository A (Sync Folder-A1 to Folder-B1)

Create the workflow file in the repo-a repository:

# .github/workflows/sync_folder_a1_to_b1.yml
name: Sync Folder-A1 to Folder-B1

on:
  push:
    paths:
      - 'Folder-A1/**'
    branches:
      - main  # Change this to the default branch of your repository if it's different

jobs:
  sync:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout Repo A (repo-a)
      uses: actions/checkout@v3
      with:
        token: ${{ secrets.PAT }}
        fetch-depth: 0  # Ensure full history is fetched

    - name: Install rsync
      run: sudo apt-get install -y rsync

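    - name: Checkout Repo B (repo-b)
      run: |
        # Embed the PAT in the remote URL so the later push to repo-b is authenticated
        git clone --branch master "https://x-access-token:${PAT}@github.com/username/repo-b.git"
        cd repo-b
        git config user.name "github-actions[bot]"
        git config user.email "github-actions[bot]@users.noreply.github.com"
      env:
        PAT: ${{ secrets.PAT }}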

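    - name: Compare and Sync Folder-A1 to Folder-B1
      run: |
        cd "$GITHUB_WORKSPACE"
        # List files changed by the latest commit; fall back to every file if there is no parent commit
        if git rev-parse HEAD~1 >/dev/null 2>&1; then
          git diff --name-only HEAD~1 -- Folder-A1/ > changed_files.txt
        else
          find Folder-A1 -type f > changed_files.txt
        fi
        cd repo-b
        while IFS= read -r file; do
          target="Folder-B1/${file#Folder-A1/}"
          if [ -f "../$file" ]; then
            mkdir -p "$(dirname "$target")"  # rsync does not create missing parent directories
            rsync -av "../$file" "$target"
          else
            rm -f "$target"  # the file was deleted in repo-a
          fi
        done < ../changed_files.txt
        git add Folder-B1
        # Skip the commit when nothing is staged (e.g. the run triggered by repo-b's own sync commit) so the job does not fail
        if ! git diff --cached --quiet; then
          git commit -m "Sync changes from Folder-A1 to Folder-B1"
          git push origin master
        fi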
      env:
        PAT: ${{ secrets.PAT }}

Workflow for Repository B (Sync Folder-B1 to Folder-A1)

Create the workflow file in the repo-b repository:

# .github/workflows/sync_folder_b1_to_a1.yml
name: Sync Folder-B1 to Folder-A1

on:
  push:
    paths:
      - 'Folder-B1/**'
    branches:
      - master  # Change this to the default branch of your repository if it's different

jobs:
  sync:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout Repo B (repo-b)
      uses: actions/checkout@v3
      with:
        token: ${{ secrets.PAT }}
        fetch-depth: 0  # Ensure full history is fetched

    - name: Install rsync
      run: sudo apt-get install -y rsync

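    - name: Checkout Repo A (repo-a)
      run: |
        # Embed the PAT in the remote URL so the later push to repo-a is authenticated
        git clone --branch main "https://x-access-token:${PAT}@github.com/username/repo-a.git"
        cd repo-a
        git config user.name "github-actions[bot]"
        git config user.email "github-actions[bot]@users.noreply.github.com"
      env:
        PAT: ${{ secrets.PAT }}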

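    - name: Compare and Sync Folder-B1 to Folder-A1
      run: |
        cd "$GITHUB_WORKSPACE"
        # List files changed by the latest commit; fall back to every file if there is no parent commit
        if git rev-parse HEAD~1 >/dev/null 2>&1; then
          git diff --name-only HEAD~1 -- Folder-B1/ > changed_files.txt
        else
          find Folder-B1 -type f > changed_files.txt
        fi
        cd repo-a
        while IFS= read -r file; do
          target="Folder-A1/${file#Folder-B1/}"
          if [ -f "../$file" ]; then
            mkdir -p "$(dirname "$target")"  # rsync does not create missing parent directories
            rsync -av "../$file" "$target"
          else
            rm -f "$target"  # the file was deleted in repo-b
          fi
        done < ../changed_files.txt
        git add Folder-A1
        # Skip the commit when nothing is staged (e.g. the run triggered by repo-a's own sync commit) so the job does not fail
        if ! git diff --cached --quiet; then
          git commit -m "Sync changes from Folder-B1 to Folder-A1"
          git push origin main
        fi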
      env:
        PAT: ${{ secrets.PAT }}

Explanation of the Workflows

  1. Trigger: Each workflow runs on a push to its configured branch that touches files under the synced folder.
  2. Checkout Repositories: The actions/checkout@v3 action checks out the source repository’s code with its full history.
  3. Install rsync: rsync copies the changed files into the target folder.
  4. Clone the Target Repository: The target repository is cloned into the workspace and given a Git user name and email for the sync commit.
  5. File Comparison and Sync: git diff lists the files changed since the previous commit (HEAD~1). Each listed file is copied to the target folder with rsync, or removed there if it no longer exists in the source.
  6. Commit and Push: The changes are committed and pushed to the target repository.
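
The path handling in step 5 can be tried locally before it runs in a workflow. The sketch below is a simplified stand-in, not the workflow itself: the helper names map_target and sync_one are hypothetical, and cp replaces rsync so that it runs on machines without rsync installed.

```shell
# Hypothetical helper: map a path under src_dir to the same relative path under dst_dir.
map_target() {
  file="$1"; src_dir="$2"; dst_dir="$3"
  # ${file#prefix} strips the shortest matching prefix from the start of $file
  rel=${file#"$src_dir"/}
  printf '%s\n' "$dst_dir/$rel"
}

# Hypothetical helper: mirror one changed path from src_root into dst_root.
# A file that still exists in the source is copied; one that is gone is deleted.
sync_one() {
  file="$1"; src_root="$2"; dst_root="$3"
  target="$dst_root/$(map_target "$file" "$4" "$5")"
  if [ -f "$src_root/$file" ]; then
    mkdir -p "$(dirname "$target")"      # create missing parent directories first
    cp -p "$src_root/$file" "$target"    # cp stands in for rsync in this sketch
  else
    rm -f "$target"
  fi
}

map_target "Folder-A1/docs/guide.md" "Folder-A1" "Folder-B1"
# prints: Folder-B1/docs/guide.md
```

Calling sync_one once per line of changed_files.txt reproduces the loop in the workflows, with the source checkout as src_root and the cloned target repository as dst_root.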

With both workflows in place, a push that changes either folder is mirrored to the other repository without manual steps. Keep in mind that each run compares only against the previous commit, so a push containing several commits syncs just the files touched by the last one.
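
One detail of a two-way setup deserves attention: a push made with a PAT triggers workflows in the target repository, so every sync commit starts the opposite workflow. That second run finds the folders already identical, and a plain git commit would then exit with an error because nothing is staged. A minimal sketch of a guard, using a hypothetical function name commit_if_changed and leaving out the push for brevity:

```shell
# Hypothetical helper: commit staged changes only if there are any.
# Must be run inside a Git repository with user.name and user.email configured.
commit_if_changed() {
  message="$1"
  # --quiet makes git diff exit 0 when the index matches HEAD, 1 otherwise
  if git diff --cached --quiet; then
    echo "nothing to sync"
    return 0
  fi
  git commit -q -m "$message"
  echo "committed"
}

# Usage inside the cloned target repository:
#   git add Folder-B1
#   commit_if_changed "Sync changes from Folder-A1 to Folder-B1"
```

With a guard like this, the echo run ends cleanly instead of showing up as a failed job, and no further commits are produced.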

Conclusion

Synchronizing specific folders between different repositories can be a crucial requirement for many projects. By using GitHub Actions, you can automate this process effectively, ensuring that changes are propagated across repositories without manual intervention. The steps outlined in this article provide a robust solution for maintaining consistency across multiple codebases.