Haiyan Meng
f4636f8555
Add a `fileType` field to the index
2020-01-17 13:15:49 -08:00
Haiyan Meng
cf8d53a195
Move SeenMap to the utils dir
2020-01-15 15:29:16 -08:00
Haiyan Meng
2e895c147e
Use log.Print* instead of fmt.Print*
2020-01-14 15:50:35 -08:00
Haiyan Meng
72eda992bd
Make `seen` a non-primitive type
2020-01-14 12:14:00 -08:00
Haiyan Meng
81d62f90bf
Improve the efficiency of crawling GitHub
...
Make sure a GitHub file is crawled only once
2020-01-14 12:14:00 -08:00
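The "crawled only once" guarantee described in this commit can be pictured with a small seen-set. This is a minimal sketch under stated assumptions, not the crawler's actual code: the `SeenMap` name comes from the commit log, but its method names (`Seen`, `Set`), the `crawlOnce` helper, and the file paths are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// SeenMap records which keys have already been visited.
// The method set here is an assumption made for this sketch.
type SeenMap map[string]struct{}

// Seen reports whether key was recorded earlier.
func (s SeenMap) Seen(key string) bool {
	_, ok := s[key]
	return ok
}

// Set records key as visited.
func (s SeenMap) Set(key string) {
	s[key] = struct{}{}
}

// crawlOnce is a hypothetical helper: it returns the paths that have not
// been visited yet, in order, and marks each of them as seen.
func crawlOnce(paths []string, seen SeenMap) []string {
	var crawled []string
	for _, p := range paths {
		if seen.Seen(p) {
			continue // already crawled, skip the duplicate
		}
		seen.Set(p)
		crawled = append(crawled, p)
	}
	return crawled
}

func main() {
	seen := SeenMap{}
	paths := []string{
		"repo/a/kustomization.yaml",
		"repo/b/kustomization.yaml",
		"repo/a/kustomization.yaml", // duplicate
	}
	fmt.Println(len(crawlOnce(paths, seen))) // number of unique paths crawled
}
```

Using a named map type rather than a bare `map[string]bool` lets the same set be passed between crawl phases and extended with methods later.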
Haiyan Meng
be2e03681d
Remove unused param from IndexFunc
2019-12-18 15:56:44 -08:00
Haiyan Meng
a35f002139
Run goimports
2019-12-18 15:56:44 -08:00
Haiyan Meng
2c2aa928cc
Delete non-existing documents from the index
2019-12-18 15:56:44 -08:00
Haiyan Meng
50ce2a66a3
Separate the two types of crawling
...
1) crawling the documents in the index to update these documents;
2) crawling the whole of GitHub.
2019-12-12 13:42:07 -08:00
Haiyan Meng
bffc0d7071
Multiple improvements to the crawler
...
1) Set document IDs to avoid duplicating documents;
2) Set the `creationTime` field of each document in the index;
3) Set the `values`, `kinds` and `identifiers` fields for all documents;
4) Add a `Copy` method to the `Document` struct: this fixes the issue
where all the documents in the index pointed to the same `Document`
object;
5) Avoid using keystore redis;
6) Set imagePullPolicy to `Always` for crawler jobs.
2019-12-11 11:10:48 -08:00
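Item 4 above describes a pointer-aliasing bug. The following is a minimal sketch of how such a bug can arise and how a `Copy` method avoids it. The `Document` struct here is a simplified, hypothetical stand-in (only `ID` and `Kinds` are shown), and the `collect` helper is invented for illustration; neither is taken from the crawler's source.

```go
package main

import "fmt"

// Document is a simplified, hypothetical stand-in for the crawler's type.
type Document struct {
	ID    string
	Kinds []string
}

// Copy returns a deep copy that shares no slice storage with d.
func (d *Document) Copy() *Document {
	return &Document{
		ID:    d.ID,
		Kinds: append([]string(nil), d.Kinds...),
	}
}

// collect reuses a single Document across iterations. With useCopy=false
// every stored pointer refers to that one object; with useCopy=true each
// entry is an independent snapshot.
func collect(ids []string, useCopy bool) []*Document {
	var out []*Document
	doc := &Document{}
	for _, id := range ids {
		doc.ID = id
		doc.Kinds = []string{"Deployment"}
		if useCopy {
			out = append(out, doc.Copy())
		} else {
			out = append(out, doc)
		}
	}
	return out
}

func main() {
	ids := []string{"a", "b", "c"}
	for _, d := range collect(ids, false) {
		fmt.Print(d.ID, " ") // every entry shows the last ID written
	}
	fmt.Println()
	for _, d := range collect(ids, true) {
		fmt.Print(d.ID, " ") // each entry keeps its own ID
	}
	fmt.Println()
}
```

Without the copy, all three entries alias one object and therefore report the ID written last; with it, each entry retains the value it had when it was stored.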
Jeffrey Regan
e9ab3da164
Fix some nits in the crawler and elsewhere.
2019-12-03 10:44:44 -08:00
Haiyan Meng
9255c991f4
Replace the `sigs.k8s.io/kustomize/hack/crawl/*` import path with `sigs.k8s.io/kustomize/api/internal/crawl/*`
2019-11-14 13:38:18 -08:00
Haiyan Meng
f69d2d2e69
Move hack/crawl under api/internal
2019-11-14 13:17:28 -08:00