kustomize/api/internal/crawl/config/crawler/job/job.yaml
Haiyan Meng bffc0d7071 Multiple improvements to the crawler
1) Set document IDs to avoid duplicating documents;
2) Set the `creationTime` field of each document in the index;
3) Set the `values`, `kinds` and `identifiers` fields for all documents;
4) Add a `Copy` method to the `Document` struct: this fixes the issue
where all documents in the index point to the same `Document`
object;
5) Avoid using the Redis keystore;
6) Set imagePullPolicy to `Always` for crawler jobs.
2019-12-11 11:10:48 -08:00

34 lines
857 B
YAML

apiVersion: batch/v1
kind: Job
metadata:
  name: crawler
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: crawler
        image: gcr.io/kustomize-search/crawler:latest
        imagePullPolicy: Always
        env:
        - name: GITHUB_ACCESS_TOKEN
          valueFrom:
            secretKeyRef:
              name: github-access-token
              key: token
        - name: ELASTICSEARCH_URL
          valueFrom:
            configMapKeyRef:
              name: elasticsearch-config
              key: es-url
        - name: REDIS_CACHE_URL
          valueFrom:
            configMapKeyRef:
              name: crawler-http-cache
              key: redis-cache-url
        - name: REDIS_KEY_URL
          valueFrom:
            configMapKeyRef:
              name: redis-keystore
              key: keystore-url
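
The Job above reads its environment from one Secret and three ConfigMaps that are not defined in this file. As a rough sketch of what those companion resources might look like: the resource names and keys below are taken from the `secretKeyRef` and `configMapKeyRef` entries in the Job, but every value is a hypothetical placeholder, and the URL format the crawler actually expects is an assumption, not something this file confirms.

```yaml
# Hypothetical companion resources for the crawler Job.
# Names and keys mirror the references in job.yaml; all values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: github-access-token
type: Opaque
stringData:
  token: "REPLACE_WITH_GITHUB_TOKEN"         # placeholder, never commit a real token
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-config
data:
  es-url: "REPLACE_WITH_ELASTICSEARCH_URL"   # assumed to be the Elasticsearch endpoint
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: crawler-http-cache
data:
  redis-cache-url: "REPLACE_WITH_REDIS_CACHE_URL"   # assumed to be the HTTP cache Redis address
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-keystore
data:
  keystore-url: "REPLACE_WITH_REDIS_KEYSTORE_URL"   # assumed to be the keystore Redis address
```

If any of these resources is missing from the namespace, the pod cannot start its container, because a referenced key in `secretKeyRef` or `configMapKeyRef` is required unless it is marked `optional: true`.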