Commit Graph

13 Commits

Author SHA1 Message Date
Haiyan Meng
f4636f8555 Add a fileType field into the index 2020-01-17 13:15:49 -08:00
Haiyan Meng
cf8d53a195 Move SeenMap to the utils dir 2020-01-15 15:29:16 -08:00
Haiyan Meng
2e895c147e Use log.Print* instead of fmt.Print* 2020-01-14 15:50:35 -08:00
Haiyan Meng
72eda992bd make seen a non-primitive type 2020-01-14 12:14:00 -08:00
Haiyan Meng
81d62f90bf Improve the efficency of crawling github
Make sure a github file is crawled once
2020-01-14 12:14:00 -08:00
Haiyan Meng
be2e03681d Remove unused param from IndexFunc 2019-12-18 15:56:44 -08:00
Haiyan Meng
a35f002139 Run goimports 2019-12-18 15:56:44 -08:00
Haiyan Meng
2c2aa928cc Delete non-existing documents from the index 2019-12-18 15:56:44 -08:00
Haiyan Meng
50ce2a66a3 Separate the two types of crawling
1) crawling the documents in the index to update these documents;
2) crawling the whole github.
2019-12-12 13:42:07 -08:00
Haiyan Meng
bffc0d7071 Mulitple improvements of the crawler
1) Set document IDs to avoid duplicating documents;
2) Set the `creationTime` field of each document in the index;
3) set the `values`, `kinds` and `identifiers` fields for all documents;
4) Add a `Copy` method into the `Document` struct: this fixes the issue
where all the documents existing in the index point to the same Document
object;
5) Avoid using keystore redis;
6) Set imagePullPolicy to `Always` for crawler jobs.
2019-12-11 11:10:48 -08:00
Jeffrey Regan
e9ab3da164 Fix some nits in the crawler and elsewhere. 2019-12-03 10:44:44 -08:00
Haiyan Meng
9255c991f4 Replace the sigs.k8s.io/kustomize/hack/crawl/* import path with
`sigs.k8s.io/kustomize/api/internal/crawl/*`
2019-11-14 13:38:18 -08:00
Haiyan Meng
f69d2d2e69 Move hack/crawl under api/internal 2019-11-14 13:17:28 -08:00