Support different modes of running the crawler

Haiyan Meng
2019-12-17 14:35:44 -08:00
parent f5ff254203
commit 127541f610
3 changed files with 72 additions and 3 deletions

@@ -0,0 +1,41 @@
There are five ways of running the crawler job.
# Crawling all the documents in the index and all the kustomization files on GitHub
This is the default setting of the crawler job.
# Crawling all the documents in the index
Set the environment variable `CRAWL_INDEX_ONLY` to `true` like this:
```
- name: CRAWL_INDEX_ONLY
  value: "true"
```
# Crawling all the kustomization files on GitHub
Set the environment variable `CRAWL_GITHUB_ONLY` to `true` like this:
```
- name: CRAWL_GITHUB_ONLY
  value: "true"
```
# Crawling all the kustomization files in a GitHub repo
Add the environment variable `GITHUB_REPO` to the crawler container. For example:
```
- name: GITHUB_REPO
  value: kubernetes-sigs/kustomize
```
# Crawling all the kustomization files in all the repositories of a GitHub user
Add the environment variable `GITHUB_USER` to the crawler container. For example:
```
- name: GITHUB_USER
  value: kubernetes-sigs
```
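Whichever mode is chosen, the variable goes in the `env` list of the crawler container. The fragment below is only a rough sketch of where such an entry might sit, assuming the container layout from the Job manifest touched by this commit. The surrounding fields are abbreviated, the image is the one named in that manifest, and existing entries such as `GITHUB_ACCESS_TOKEN` are left out rather than guessed at.

```yaml
# Sketch only: the container layout is assumed from the Job manifest in this commit.
containers:
- name: crawler
  image: gcr.io/kustomize-search/crawler:latest
  imagePullPolicy: Always
  env:
  # Other entries (for example GITHUB_ACCESS_TOKEN) are omitted here.
  - name: GITHUB_USER
    value: kubernetes-sigs
```

For the flag-style variables, quote the value (`value: "true"`): Kubernetes expects environment variable values to be strings, and an unquoted `true` is parsed by YAML as a boolean.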

@@ -8,7 +8,7 @@ spec:
restartPolicy: OnFailure
containers:
- name: crawler
-  image: gcr.io/kustomize-search/crawler:latest
+  image: gcr.io/haiyanmeng-gke-dev/crawler:v1
imagePullPolicy: Always
env:
- name: GITHUB_ACCESS_TOKEN