# GitHub

GitHub is a platform for version control and collaboration using Git. The Sling GitHub connector extracts data from the GitHub REST API, supporting repositories, issues, pull requests, commits, workflows, and more.

{% hint style="success" %}
**CLI Pro Required**: API connections require a [CLI Pro token](https://docs.slingdata.io/sling-cli/cli-pro) or [Platform Plan](https://docs.slingdata.io/sling-platform/platform).
{% endhint %}

## Setup

The following credentials and inputs are accepted:

**Secrets:**

* `access_token` **(required)** -> Your GitHub Personal Access Token (PAT)

**Inputs:**

* `owner` **(required)** -> The GitHub organization or username to extract data from
* `repositories` **(required)** -> List of repository names to extract data from
* `anchor_date` (optional) -> The starting date for historical data extraction (default: 1 year ago). Format: `YYYY-MM-DD`
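
Since `anchor_date` must follow the `YYYY-MM-DD` format, it can help to check the value before writing it into a connection. The sketch below is a hypothetical helper (the name `is_valid_anchor_date` is illustrative and not part of Sling). It checks only the shape of the string, not whether the calendar date actually exists.

```bash
# Requires bash (uses [[ =~ ]]).
# Hypothetical helper: returns 0 when the argument looks like YYYY-MM-DD.
# Shape check only; it does not reject impossible dates such as 2023-02-31.
is_valid_anchor_date() {
  local re='^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$'
  [[ "$1" =~ $re ]]
}

if is_valid_anchor_date "2019-01-01"; then
  echo "anchor_date looks valid"
fi
```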

### Getting Your Personal Access Token

1. Go to [GitHub Settings > Developer settings > Personal access tokens](https://github.com/settings/tokens)
2. Click "Generate new token (classic)" for a classic token, or "Generate new token (Beta)" for a fine-grained token
3. Give your token a descriptive name (e.g., "Sling Integration")
4. For classic tokens, select these scopes:
   * `repo` - Full control of private repositories (or `public_repo` for public only)
   * `read:org` - Read organization membership
   * `read:user` - Read user profile data
   * `read:project` - Read project boards
5. Click "Generate token" and copy the token (classic tokens start with `ghp_`)
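
As a quick sanity check before storing a token, its prefix can be inspected. This is a minimal sketch: the function name `token_kind` is hypothetical, and the `github_pat_` prefix for fine-grained tokens comes from GitHub's token format rather than from this page.

```bash
# Hypothetical helper: classify a GitHub token by its prefix.
# It does not verify that the token is valid or has the required scopes.
token_kind() {
  case "$1" in
    ghp_*)        echo "classic" ;;
    github_pat_*) echo "fine-grained" ;;
    *)            echo "unknown" ;;
  esac
}

token_kind "ghp_xxxxxxxxxxxx"
```

The scope list in step 4 applies only to classic tokens, so a result of `fine-grained` is a hint to review the token's repository permissions instead.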

### Using `sling conns`

Here is an example of setting a connection named `GITHUB`. The `type=api` property is required:

{% code overflow="wrap" %}

```bash
sling conns set GITHUB type=api spec=github \
  secrets='{ access_token: ghp_xxxxxxxxxxxx }' \
  inputs='{ owner: my-org, repositories: [repo1, repo2] }'
```

{% endcode %}

### Environment Variable

See [here](https://docs.slingdata.io/sling-cli/environment#dot-env-file-.env.sling) to learn more about the `.env.sling` file.

{% code overflow="wrap" %}

```bash
export GITHUB='{ type: api, spec: github, secrets: { access_token: "ghp_xxxxxxxxxxxx" }, inputs: { owner: "my-org", repositories: ["repo1", "repo2"] } }'
```

{% endcode %}
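
To avoid hard-coding the token in a single long string, the connection value can be assembled from separate variables. The helper below is a hypothetical sketch (`build_github_conn` and `GH_PAT` are illustrative names, not Sling features). It only produces the same inline structure as the export example, and it assumes the values contain no double quotes.

```bash
# Requires bash. Hypothetical helper: prints a connection string in the
# same inline format as the export example.
# Usage: build_github_conn <token> <owner> <repo> [<repo> ...]
build_github_conn() {
  local token="$1" owner="$2"
  shift 2
  local repos="" repo
  for repo in "$@"; do
    # Append each repository as a quoted list item, comma-separated.
    repos+="${repos:+, }\"${repo}\""
  done
  printf '{ type: api, spec: github, secrets: { access_token: "%s" }, inputs: { owner: "%s", repositories: [%s] } }\n' \
    "$token" "$owner" "$repos"
}

# GH_PAT is an assumed variable holding your token; the placeholder is used if it is unset.
GITHUB="$(build_github_conn "${GH_PAT:-ghp_xxxxxxxxxxxx}" my-org repo1 repo2)"
export GITHUB
```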

### Sling Env File YAML

See [here](https://docs.slingdata.io/sling-cli/environment#sling-env-file-env.yaml) to learn more about the Sling `env.yaml` file.

```yaml
connections:
  GITHUB:
    type: api
    spec: github
    secrets:
      access_token: "ghp_xxxxxxxxxxxx"
    inputs:
      owner: my-org
      repositories:
        - repo1
        - repo2
        - repo3
```

**With anchor date for historical data:**

```yaml
connections:
  GITHUB:
    type: api
    spec: github
    secrets:
      access_token: "ghp_xxxxxxxxxxxx"
    inputs:
      owner: slingdata-io
      repositories:
        - sling
        - sling-cli
        - sling-api-spec
      anchor_date: "2019-01-01"
```

## Replication

Here's an example replication configuration to sync GitHub data to a PostgreSQL database:

```yaml
source: GITHUB
target: MY_POSTGRES

defaults:
  mode: incremental
  object: github.{stream_name}

streams:
  # Sync all endpoints
  '*':

  # Exclude commits and commit comments
  commits:
    disabled: true
  commit_comments:
    disabled: true
```

**Full refresh example for reference data:**

```yaml
source: GITHUB
target: MY_POSTGRES

defaults:
  mode: full-refresh
  object: github.{stream_name}

streams:
  # Core data
  repositories:
  issues:
  pull_requests:
  commits:

  # Issue details
  comments:
  issue_events:

  # PR details
  reviews:
  pull_request_stats:

  # CI/CD
  workflows:
  workflow_runs:
```

## Endpoints

| Endpoint                   | Description                                                           | Incremental |
| -------------------------- | --------------------------------------------------------------------- | ----------- |
| `organizations`            | Organizations the authenticated user belongs to                       | No          |
| `repositories`             | Repository details for configured repos                               | No          |
| `users`                    | Owner/organization profile information                                | No          |
| `assignees`                | Users who can be assigned to issues                                   | No          |
| `branches`                 | Repository branches                                                   | No          |
| `collaborators`            | Repository collaborators with permissions                             | No          |
| `issue_labels`             | Labels available in the repository                                    | No          |
| `tags`                     | Git tags                                                              | No          |
| `teams`                    | Organization teams                                                    | No          |
| `commits`                  | Repository commits                                                    | Yes         |
| `deployments`              | Deployment history                                                    | Yes         |
| `events`                   | Repository activity events                                            | Yes         |
| `issues`                   | Issues (the API also returns pull requests as issues)                 | Yes         |
| `issue_milestones`         | Issue milestones                                                      | Yes         |
| `projects`                 | Classic project boards                                                | Yes         |
| `pull_requests`            | Pull requests                                                         | Yes         |
| `releases`                 | Repository releases                                                   | Yes         |
| `stargazers`               | Users who starred the repo                                            | No          |
| `workflows`                | GitHub Actions workflows                                              | No          |
| `workflow_runs`            | Workflow execution history                                            | Yes         |
| `comments`                 | Issue comments                                                        | Yes         |
| `issue_events`             | Issue timeline events                                                 | No          |
| `issue_reactions`          | Reactions on issues                                                   | No          |
| `issue_timeline`           | Full timeline events for issues (comments, labels, assignments, etc.) | No          |
| `commit_comments`          | Comments on commits                                                   | No          |
| `pull_request_commits`     | Commits in a PR                                                       | No          |
| `pull_request_stats`       | PR statistics (additions, deletions, etc.)                            | No          |
| `reviews`                  | PR reviews                                                            | No          |
| `review_comments`          | PR review comments                                                    | Yes         |
| `project_columns`          | Project board columns                                                 | No          |
| `project_cards`            | Cards in project columns                                              | No          |
| `issue_comment_reactions`  | Reactions on issue comments                                           | No          |
| `commit_comment_reactions` | Reactions on commit comments                                          | No          |
| `review_comment_reactions` | Reactions on PR review comments                                       | No          |

To discover available endpoints:

```bash
sling conns discover GITHUB
```

## Rate Limiting

The GitHub API has rate limits:

* **Authenticated requests:** 5,000 requests per hour
* **Search API:** 30 requests per minute

The connector automatically:

* Uses conservative rate limiting (10 requests/second)
* Retries with exponential backoff on 429 (rate limit) responses
* Retries on 403 responses containing "rate limit"
* Checks remaining quota before starting and stops if below 500 requests
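
The retry and quota behavior above can be pictured with a small sketch. The exact backoff parameters the connector uses are not stated on this page, so the 1-second base delay and 60-second cap below are assumptions for illustration only. The 500-request threshold is the one listed above. Both function names are hypothetical.

```bash
# Requires bash. Illustrative only: not the connector's actual implementation.

# Delay in seconds before retry number <attempt> (0-based).
# Doubles each time from an assumed 1s base, capped at an assumed 60s.
backoff_delay() {
  local attempt="$1" base=1 cap=60
  local delay=$(( base * (2 ** attempt) ))
  if (( delay > cap )); then
    delay=$cap
  fi
  echo "$delay"
}

# Returns 0 when the remaining quota is at or above the 500-request threshold.
has_enough_quota() {
  local remaining="$1" threshold=500
  (( remaining >= threshold ))
}

for attempt in 0 1 2 3; do
  echo "retry ${attempt}: wait $(backoff_delay "$attempt")s"
done
```

For a sense of scale, a sustained 10 requests/second would use a 5,000-request hourly budget in 500 seconds, which is why the quota check and the retry handling both matter on large syncs.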

## Common Use Cases

### Sync Issues and PRs for Analytics

```yaml
source: GITHUB
target: MY_POSTGRES

defaults:
  mode: incremental
  object: analytics.{stream_name}

streams:
  issues:
  pull_requests:
  pull_request_stats:
  comments:
```

### CI/CD Pipeline Monitoring

```yaml
source: GITHUB
target: MY_POSTGRES

defaults:
  mode: incremental
  object: cicd.{stream_name}

streams:
  workflows:
    mode: full-refresh
  workflow_runs:
```

### Repository Metadata Sync

```yaml
source: GITHUB
target: MY_POSTGRES

defaults:
  mode: full-refresh
  object: github_meta.{stream_name}

streams:
  repositories:
  branches:
  tags:
  collaborators:
  issue_labels:
```

If you are facing issues connecting, please reach out to us at <support@slingdata.io>, on [Discord](https://discord.gg/q5xtaSNDvp), or open a GitHub issue [here](https://github.com/slingdata-io/sling-cli/issues).

