# GitHub

GitHub is a platform for version control and collaboration using Git. The Sling GitHub connector extracts data from the GitHub REST API, supporting repositories, issues, pull requests, commits, workflows, and more.

{% hint style="success" %}
**CLI Pro Required**: APIs require a [CLI Pro token](https://docs.slingdata.io/sling-cli/cli-pro) or [Platform Plan](https://docs.slingdata.io/sling-platform/platform).
{% endhint %}

## Setup

The following credentials and inputs are accepted:

**Secrets:**

* `access_token` **(required)** -> Your GitHub Personal Access Token (PAT)

**Inputs:**

* `owner` **(required)** -> The GitHub organization or username to extract data from
* `repositories` **(required)** -> List of repository names to extract data from
* `anchor_date` (optional) -> The starting date for historical data extraction (default: 1 year ago). Format: `YYYY-MM-DD`

### Getting Your Personal Access Token

1. Go to [GitHub Settings > Developer settings > Personal access tokens](https://github.com/settings/tokens)
2. Click "Generate new token (classic)" for a classic token, or choose the fine-grained option for a fine-grained token
3. Give your token a descriptive name (e.g., "Sling Integration")
4. For classic tokens, select these scopes:
   * `repo` - Full control of private repositories (or `public_repo` for public only)
   * `read:org` - Read organization membership
   * `read:user` - Read user profile data
   * `read:project` - Read project boards
5. Click "Generate token" and copy the token (classic tokens start with `ghp_`; fine-grained tokens start with `github_pat_`)
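
Before wiring the token into Sling, it can help to confirm that GitHub accepts it. The sketch below is one way to do that with `curl` against GitHub's `GET /user` endpoint. The `GITHUB_TOKEN` variable and both helper function names are illustrative and not part of Sling. For classic tokens, the `x-oauth-scopes` response header lists the granted scopes.

```bash
# Build the Authorization header for a GitHub token (hypothetical helper).
github_auth_header() {
  printf 'Authorization: Bearer %s' "$1"
}

# Print the HTTP status line and the granted scopes for a token (hypothetical helper).
# Requires curl and network access.
check_github_token() {
  curl -sS -I \
    -H "$(github_auth_header "$1")" \
    -H "Accept: application/vnd.github+json" \
    https://api.github.com/user | grep -i -E '^(HTTP/|x-oauth-scopes:)'
}

# Usage (assumes the token was exported as GITHUB_TOKEN):
#   check_github_token "$GITHUB_TOKEN"
```

A `200` status indicates the token is valid; a `401` usually means it was mistyped, expired, or revoked.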

### Using `sling conns`

Here is an example of setting a connection named `GITHUB`. We must provide the `type=api` property:

{% code overflow="wrap" %}

```bash
sling conns set GITHUB type=api spec=github \
  secrets='{ access_token: ghp_xxxxxxxxxxxx }' \
  inputs='{ owner: my-org, repositories: [repo1, repo2] }'
```

{% endcode %}
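
Once the connection is set, it can be checked with `sling conns test`. The snippet below is a small sketch that guards the call so it only runs when the `sling` binary is on the `PATH`; the `CONN_NAME` variable is used purely for illustration.

```bash
# Name of the connection defined above.
CONN_NAME="GITHUB"

if command -v sling >/dev/null 2>&1; then
  # Attempts to connect using the stored credentials.
  sling conns test "$CONN_NAME"
else
  echo "sling is not installed or not on PATH" >&2
fi
```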

### Environment Variable

See [here](https://docs.slingdata.io/sling-cli/environment#dot-env-file-.env.sling) to learn more about the `.env.sling` file.

{% code overflow="wrap" %}

```bash
export GITHUB='{ type: api, spec: github, secrets: { access_token: "ghp_xxxxxxxxxxxx" }, inputs: { owner: "my-org", repositories: ["repo1", "repo2"] } }'
```

{% endcode %}

### Sling Env File YAML

See [here](https://docs.slingdata.io/sling-cli/environment#sling-env-file-env.yaml) to learn more about the Sling `env.yaml` file.

```yaml
connections:
  GITHUB:
    type: api
    spec: github
    secrets:
      access_token: "ghp_xxxxxxxxxxxx"
    inputs:
      owner: my-org
      repositories:
        - repo1
        - repo2
        - repo3
```

**With anchor date for historical data:**

```yaml
connections:
  GITHUB:
    type: api
    spec: github
    secrets:
      access_token: "ghp_xxxxxxxxxxxx"
    inputs:
      owner: slingdata-io
      repositories:
        - sling
        - sling-cli
        - sling-api-spec
      anchor_date: "2019-01-01"
```

## Replication

Here's an example replication configuration to sync GitHub data to a PostgreSQL database:

```yaml
source: GITHUB
target: MY_POSTGRES

defaults:
  mode: incremental
  object: github.{stream_name}

streams:
  # Sync all endpoints
  '*':

  # Exclude commits and commit comments
  commits:
    disabled: true
  commit_comments:
    disabled: true
```

**Full refresh example for reference data:**

```yaml
source: GITHUB
target: MY_POSTGRES

defaults:
  mode: full-refresh
  object: github.{stream_name}

streams:
  # Core data
  repositories:
  issues:
  pull_requests:
  commits:

  # Issue details
  comments:
  issue_events:

  # PR details
  reviews:
  pull_request_stats:

  # CI/CD
  workflows:
  workflow_runs:
```
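
Either configuration above can be saved to a file and passed to `sling run` with the `-r` flag. In the sketch below, the filename `github_replication.yaml` is an assumption; substitute whatever path holds your replication.

```bash
# Hypothetical path to a replication YAML like the ones shown above.
REPLICATION_FILE="github_replication.yaml"

if command -v sling >/dev/null 2>&1 && [ -f "$REPLICATION_FILE" ]; then
  # Runs every enabled stream in the replication file.
  sling run -r "$REPLICATION_FILE"
else
  echo "Skipping: sling or $REPLICATION_FILE not found" >&2
fi
```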

## Endpoints

| Endpoint                   | Description                                                           | Incremental |
| -------------------------- | --------------------------------------------------------------------- | ----------- |
| `organizations`            | Organizations the authenticated user belongs to                       | No          |
| `repositories`             | Repository details for configured repos                               | No          |
| `users`                    | Owner/organization profile information                                | No          |
| `assignees`                | Users who can be assigned to issues                                   | No          |
| `branches`                 | Repository branches                                                   | No          |
| `collaborators`            | Repository collaborators with permissions                             | No          |
| `issue_labels`             | Labels available in the repository                                    | No          |
| `tags`                     | Git tags                                                              | No          |
| `teams`                    | Organization teams                                                    | No          |
| `commits`                  | Repository commits                                                    | Yes         |
| `deployments`              | Deployment history                                                    | Yes         |
| `events`                   | Repository activity events                                            | Yes         |
| `issues`                   | Issues (the API also returns pull requests as issues)                 | Yes         |
| `issue_milestones`         | Issue milestones                                                      | Yes         |
| `projects`                 | Classic project boards                                                | Yes         |
| `pull_requests`            | Pull requests                                                         | Yes         |
| `releases`                 | Repository releases                                                   | Yes         |
| `stargazers`               | Users who starred the repo                                            | No          |
| `workflows`                | GitHub Actions workflows                                              | No          |
| `workflow_runs`            | Workflow execution history                                            | Yes         |
| `comments`                 | Issue comments                                                        | Yes         |
| `issue_events`             | Issue timeline events                                                 | No          |
| `issue_reactions`          | Reactions on issues                                                   | No          |
| `issue_timeline`           | Full timeline events for issues (comments, labels, assignments, etc.) | No          |
| `commit_comments`          | Comments on commits                                                   | No          |
| `pull_request_commits`     | Commits in a PR                                                       | No          |
| `pull_request_stats`       | PR statistics (additions, deletions, etc.)                            | No          |
| `reviews`                  | PR reviews                                                            | No          |
| `review_comments`          | PR review comments                                                    | Yes         |
| `project_columns`          | Project board columns                                                 | No          |
| `project_cards`            | Cards in project columns                                              | No          |
| `issue_comment_reactions`  | Reactions on issue comments                                           | No          |
| `commit_comment_reactions` | Reactions on commit comments                                          | No          |
| `review_comment_reactions` | Reactions on PR review comments                                       | No          |

To discover available endpoints:

```bash
sling conns discover GITHUB
```

## Rate Limiting

The GitHub API has rate limits:

* **Authenticated requests:** 5,000 requests per hour
* **Search API:** 30 requests per minute

The connector automatically:

* Uses conservative rate limiting (10 requests/second)
* Retries with exponential backoff on 429 (rate limit) responses
* Retries on 403 responses containing "rate limit"
* Checks remaining quota before starting and stops if below 500 requests
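
To put these numbers in perspective, the arithmetic below is a rough sketch (not the connector's internal logic) that uses only the figures quoted above. At a sustained 10 requests/second, a 5,000-request hourly quota lasts 500 seconds, so on large syncs the hourly quota is likely to be the binding constraint rather than the per-second throttle. The `has_enough_quota` helper is hypothetical and simply mirrors the 500-request pre-flight threshold; the current remaining quota can be read from GitHub's `GET /rate_limit` endpoint.

```bash
HOURLY_QUOTA=5000       # authenticated requests per hour
REQUESTS_PER_SECOND=10  # connector throttle
MIN_REMAINING=500       # pre-flight threshold

# Seconds of sustained traffic needed to use up the hourly quota.
seconds_to_exhaust=$(( HOURLY_QUOTA / REQUESTS_PER_SECOND ))
echo "Quota exhausted after ${seconds_to_exhaust}s at full throttle"

# Hypothetical helper: succeeds when the remaining quota is at or above the threshold.
has_enough_quota() {
  [ "$1" -ge "$MIN_REMAINING" ]
}

has_enough_quota 1200 && echo "1200 remaining: ok to start"
has_enough_quota 120  || echo "120 remaining: below threshold, stop"
```

For large organizations, narrowing `repositories` or disabling high-volume streams such as `commits` is one way to stay within the hourly budget.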

## Common Use Cases

### Sync Issues and PRs for Analytics

```yaml
source: GITHUB
target: MY_POSTGRES

defaults:
  mode: incremental
  object: analytics.{stream_name}

streams:
  issues:
  pull_requests:
  pull_request_stats:
  comments:
```

### CI/CD Pipeline Monitoring

```yaml
source: GITHUB
target: MY_POSTGRES

defaults:
  mode: incremental
  object: cicd.{stream_name}

streams:
  workflows:
    mode: full-refresh
  workflow_runs:
```

### Repository Metadata Sync

```yaml
source: GITHUB
target: MY_POSTGRES

defaults:
  mode: full-refresh
  object: github_meta.{stream_name}

streams:
  repositories:
  branches:
  tags:
  collaborators:
  issue_labels:
```

If you are facing issues connecting, please reach out to us at <support@slingdata.io>, on [Discord](https://discord.gg/q5xtaSNDvp), or open a GitHub issue [here](https://github.com/slingdata-io/sling-cli/issues).
