GitHub

Connect & Ingest data from GitHub

GitHub is a platform for version control and collaboration using Git. The Sling GitHub connector extracts data from the GitHub REST API, supporting repositories, issues, pull requests, commits, workflows, and more.

Setup

The following credentials and inputs are accepted:

Secrets:

  • access_token (required) -> Your GitHub Personal Access Token (PAT)

Inputs:

  • owner (required) -> The GitHub organization or username to extract data from

  • repositories (required) -> List of repository names to extract data from

  • anchor_date (optional) -> The starting date for historical data extraction (default: 1 year ago). Format: YYYY-MM-DD
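
As a rough illustration of the anchor_date input, the sketch below computes a "1 year ago" default and checks the YYYY-MM-DD format. The function names are hypothetical, and how the connector itself derives its default is not documented here, so treat the leap-day handling as an assumption.

```python
from datetime import date, datetime


def default_anchor_date(today: date) -> str:
    """Return the date one year before `today` as YYYY-MM-DD (assumed default)."""
    try:
        one_year_ago = today.replace(year=today.year - 1)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year; fall back to Feb 28.
        one_year_ago = today.replace(year=today.year - 1, day=28)
    return one_year_ago.isoformat()


def is_valid_anchor_date(value: str) -> bool:
    """Check that `value` is a real calendar date written exactly as YYYY-MM-DD."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    # strptime tolerates unpadded fields such as "2019-1-1"; require zero padding.
    return parsed.strftime("%Y-%m-%d") == value
```

Running a check like is_valid_anchor_date before writing the value into a connection definition can catch inputs such as 01/01/2019 early.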

Getting Your Personal Access Token

  1. In GitHub, open Settings > Developer settings > Personal access tokens, then click "Generate new token" (classic) or "Generate new token (Beta)" for fine-grained tokens

  2. Give your token a descriptive name (e.g., "Sling Integration")

  3. For classic tokens, select these scopes:

    • repo - Full control of private repositories (or public_repo for public only)

    • read:org - Read organization membership

    • read:user - Read user profile data

    • read:project - Read project boards

  4. Click "Generate token" and copy the token right away, since GitHub shows it only once (classic tokens start with ghp_, fine-grained tokens start with github_pat_)

Using sling conns

Here are examples of setting a connection named GITHUB. We must provide the type=api property:

sling conns set GITHUB type=api spec=github \
  secrets='{ access_token: ghp_xxxxxxxxxxxx }' \
  inputs='{ owner: my-org, repositories: [repo1, repo2] }'

Environment Variable

export GITHUB='{ type: api, spec: github, secrets: { access_token: "ghp_xxxxxxxxxxxx" }, inputs: { owner: "my-org", repositories: ["repo1", "repo2"] } }'
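
If the environment variable value is generated by a script, a small helper can avoid quoting mistakes. The sketch below builds the same structure as a JSON string. It assumes Sling accepts JSON here because JSON is a subset of YAML flow syntax, and github_connection_payload is a hypothetical helper, not part of Sling.

```python
import json


def github_connection_payload(access_token, owner, repositories, anchor_date=None):
    """Build the connection definition shown above as a JSON string."""
    inputs = {"owner": owner, "repositories": list(repositories)}
    if anchor_date is not None:
        # Optional input; omitted so the connector falls back to its default.
        inputs["anchor_date"] = anchor_date
    payload = {
        "type": "api",
        "spec": "github",
        "secrets": {"access_token": access_token},
        "inputs": inputs,
    }
    return json.dumps(payload)


if __name__ == "__main__":
    # Placeholder token, as in the examples above.
    print(github_connection_payload("ghp_xxxxxxxxxxxx", "my-org", ["repo1", "repo2"]))
```

The printed string can then be assigned to the GITHUB variable in the shell or in a deployment secret.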

Sling Env File YAML

See here to learn more about the sling env.yaml file.

connections:
  GITHUB:
    type: api
    spec: github
    secrets:
      access_token: "ghp_xxxxxxxxxxxx"
    inputs:
      owner: my-org
      repositories:
        - repo1
        - repo2
        - repo3

With anchor date for historical data:

connections:
  GITHUB:
    type: api
    spec: github
    secrets:
      access_token: "ghp_xxxxxxxxxxxx"
    inputs:
      owner: slingdata-io
      repositories:
        - sling
        - sling-cli
        - sling-api-spec
      anchor_date: "2019-01-01"

Replication

Here's an example replication configuration to sync GitHub data to a PostgreSQL database:

source: GITHUB
target: MY_POSTGRES

defaults:
  mode: incremental
  object: github.{stream_table}

streams:
  # sync all endpoints
  '*':

  # Exclude commits
  commits:
    disabled: true
  commit_comments:
    disabled: true
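
To make the wildcard-plus-exclusion pattern above concrete, here is a minimal Python model of the selection logic it describes. It is only an illustration of the intended semantics, not Sling's implementation. Both function names are hypothetical, and it assumes {stream_table} resolves to the endpoint name.

```python
def selected_streams(available, streams_cfg):
    """Return the endpoint names that would be synced, in `available` order.

    `streams_cfg` mirrors the `streams:` mapping: a key of '*' selects every
    endpoint, and a per-stream `disabled: true` removes one from the selection.
    """
    wildcard = "*" in streams_cfg
    selected = []
    for name in available:
        if not (wildcard or name in streams_cfg):
            continue
        cfg = streams_cfg.get(name)
        if isinstance(cfg, dict) and cfg.get("disabled"):
            continue
        selected.append(name)
    return selected


def target_object(pattern, stream):
    """Expand the {stream_table} placeholder (assumed to be the endpoint name)."""
    return pattern.replace("{stream_table}", stream)
```

With the configuration above, every endpoint except commits and commit_comments is selected, and a stream such as issues would land in github.issues.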

Full refresh example for reference data:

source: GITHUB
target: MY_POSTGRES

defaults:
  mode: full-refresh
  object: github.{stream_table}

streams:
  # Core data
  repositories:
  issues:
  pull_requests:
  commits:

  # Issue details
  comments:
  issue_events:

  # PR details
  reviews:
  pull_request_stats:

  # CI/CD
  workflows:
  workflow_runs:

Endpoints

| Endpoint | Description | Incremental |
|---|---|---|
| organizations | Organizations the authenticated user belongs to | No |
| repositories | Repository details for configured repos | No |
| users | Owner/organization profile information | No |
| assignees | Users who can be assigned to issues | No |
| branches | Repository branches | No |
| collaborators | Repository collaborators with permissions | No |
| issue_labels | Labels available in the repository | No |
| tags | Git tags | No |
| teams | Organization teams | No |
| commits | Repository commits | Yes |
| deployments | Deployment history | Yes |
| events | Repository activity events | Yes |
| issues | Issues (includes PRs in API) | Yes |
| issue_milestones | Issue milestones | Yes |
| projects | Classic project boards | Yes |
| pull_requests | Pull requests | Yes |
| releases | Repository releases | Yes |
| stargazers | Users who starred the repo | No |
| workflows | GitHub Actions workflows | No |
| workflow_runs | Workflow execution history | Yes |
| comments | Issue comments | Yes |
| issue_events | Issue timeline events | No |
| issue_reactions | Reactions on issues | No |
| issue_timeline | Full timeline events for issues (comments, labels, assignments, etc.) | No |
| commit_comments | Comments on commits | No |
| pull_request_commits | Commits in a PR | No |
| pull_request_stats | PR statistics (additions, deletions, etc.) | No |
| reviews | PR reviews | No |
| review_comments | PR review comments | Yes |
| project_columns | Project board columns | No |
| project_cards | Cards in project columns | No |
| issue_comment_reactions | Reactions on issue comments | No |
| commit_comment_reactions | Reactions on commit comments | No |
| review_comment_reactions | Reactions on PR review comments | No |
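
When planning a replication, it can help to separate the streams marked Incremental in the endpoint list above from the rest. The sketch below hard-codes that column as a set. split_by_mode is a hypothetical helper, and the set needs updating if the connector's endpoint list changes.

```python
# Endpoints marked "Yes" in the Incremental column of the endpoint list above.
INCREMENTAL_ENDPOINTS = {
    "commits", "deployments", "events", "issues", "issue_milestones",
    "projects", "pull_requests", "releases", "workflow_runs",
    "comments", "review_comments",
}


def split_by_mode(streams):
    """Split stream names into (incremental-capable, full-refresh-only) lists."""
    incremental = [s for s in streams if s in INCREMENTAL_ENDPOINTS]
    full_refresh = [s for s in streams if s not in INCREMENTAL_ENDPOINTS]
    return incremental, full_refresh


if __name__ == "__main__":
    inc, full = split_by_mode(["issues", "pull_requests", "pull_request_stats", "comments"])
    print("incremental:", inc)
    print("full refresh only:", full)
```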

To discover available endpoints:

sling conns discover GITHUB

Rate Limiting

The GitHub API has rate limits:

  • Authenticated requests: 5,000 requests per hour

  • Search API: 30 requests per minute

The connector automatically:

  • Uses conservative rate limiting (10 requests/second)

  • Retries with exponential backoff on 429 (rate limit) responses

  • Retries on 403 responses containing "rate limit"

  • Checks remaining quota before starting and stops if below 500 requests
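
The behavior listed above can be sketched in a few small Python functions. This is an illustrative model under stated assumptions, not the connector's code: the function names, the 1-second base delay, the 60-second cap, and the case-insensitive match on "rate limit" are all hypothetical choices.

```python
def is_rate_limited(status: int, body: str) -> bool:
    """True for a 429, or for a 403 whose body mentions a rate limit."""
    if status == 429:
        return True
    return status == 403 and "rate limit" in body.lower()


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff: the delay doubles per attempt, up to `cap` seconds."""
    return min(cap, base * 2 ** attempt)


def can_start(remaining: int, floor: int = 500) -> bool:
    """Mirror the pre-flight check: do not start when quota is below `floor`."""
    return remaining >= floor


def seconds_to_exhaust(quota: int, rate_per_second: float) -> float:
    """How long a sustained request rate takes to use up the whole quota."""
    return quota / rate_per_second
```

As a worked example, seconds_to_exhaust(5000, 10) is 500 seconds, so a run that sustains 10 requests/second would use the full hourly quota in a little over eight minutes. That is why the retry and quota checks matter for large syncs.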

Common Use Cases

Sync Issues and PRs for Analytics

source: GITHUB
target: MY_POSTGRES

defaults:
  mode: incremental
  object: analytics.{stream_table}

streams:
  issues:
  pull_requests:
  pull_request_stats:
  comments:

CI/CD Pipeline Monitoring

source: GITHUB
target: MY_POSTGRES

defaults:
  mode: incremental
  object: cicd.{stream_table}

streams:
  workflows:
    mode: full-refresh
  workflow_runs:

Repository Metadata Sync

source: GITHUB
target: MY_POSTGRES

defaults:
  mode: full-refresh
  object: github_meta.{stream_table}

streams:
  repositories:
  branches:
  tags:
  collaborators:
  issue_labels:

If you are facing issues connecting, please reach out to us at [email protected], on Discord, or open a GitHub issue here.
