Commit Graph

18 Commits

Author SHA1 Message Date
Haiyan Meng
d5c66cb3d4 Add KustomizationDocument.Copy method 2020-02-03 09:59:52 -08:00
Haiyan Meng
1120c6bc7a Add a User field into Document to make it easy to aggregate at the GitHub
user level.
2020-01-21 10:09:52 -08:00
Haiyan Meng
f4636f8555 Add a fileType field into the index 2020-01-17 13:15:49 -08:00
Haiyan Meng
cf8d53a195 Move SeenMap to the utils dir 2020-01-15 15:29:16 -08:00
Haiyan Meng
29e50ab476 Collect stats on generators and transformers 2020-01-15 12:10:08 -08:00
Haiyan Meng
3519cc56a1 Add support for getting files referenced in the generators and transformers
fields
2020-01-15 12:10:08 -08:00
Haiyan Meng
2e895c147e Use log.Print* instead of fmt.Print* 2020-01-14 15:50:35 -08:00
Haiyan Meng
142c105500 Skip the empty resource/base item in a kustomization file and set the
defaultBranch if needed
2020-01-06 12:06:18 -08:00
Haiyan Meng
a35f002139 Run goimports 2019-12-18 15:56:44 -08:00
Haiyan Meng
1eb713157c Sort the string slice fields of a document to avoid updating the index
unnecessarily
2019-12-18 15:56:44 -08:00
Haiyan Meng
8c89f0946c Avoid indexing a document if FetchDocument or SetCreated fails 2019-12-18 15:56:44 -08:00
Haiyan Meng
12fc8f41c7 Add support for GitHub paths starting with "git@github.com:" 2019-12-18 15:56:44 -08:00
Haiyan Meng
bffc0d7071 Multiple improvements to the crawler
1) Set document IDs to avoid duplicating documents;
2) Set the `creationTime` field of each document in the index;
3) Set the `values`, `kinds` and `identifiers` fields for all documents;
4) Add a `Copy` method into the `Document` struct: this fixes the issue
where all the documents existing in the index point to the same Document
object;
5) Avoid using keystore redis;
6) Set imagePullPolicy to `Always` for crawler jobs.
2019-12-11 11:10:48 -08:00
Haiyan Meng
d25b6ff3dc Remove duplicates in kinds 2019-12-04 11:24:54 -08:00
Haiyan Meng
68a196dbe5 Add a test case to demonstrate kinds have duplicates 2019-12-04 11:24:23 -08:00
Haiyan Meng
31c5e89b1f Add String method to KustomizationDocument to avoid printing the
content of kustomization.yaml
2019-11-26 14:49:44 -08:00
Haiyan Meng
d08140d3f7 Remove api/internal/hack/crawl/crawler/git dir, use api/internal/git
instead.
2019-11-14 13:35:00 -08:00
Haiyan Meng
f69d2d2e69 Move hack/crawl under api/internal 2019-11-14 13:17:28 -08:00