Commit Graph

35 Commits

Author SHA1 Message Date
Haiyan Meng
a83433d5cf Optimize memory usage by avoiding accumulating all the referred
documents into a single stack.
2020-06-23 11:36:29 -07:00
Haiyan Meng
2d496e0efe Update golang to 1.14 2020-06-23 11:25:37 -07:00
Haiyan Meng
171412cc98 Use RWMutex to control the map access
Without RWMutex, we may run into fatal error: concurrent map read and map write.
2020-06-23 11:25:37 -07:00
Haiyan Meng
c7bdb3fbe4 Add cmds to process the kustomize-stats log 2020-02-05 11:04:59 -08:00
Haiyan Meng
d0602c732b Remove the usage of github access token from the kustomize-stats job 2020-02-05 11:04:59 -08:00
Haiyan Meng
0b38e6d284 Improve the analysis on generator and transformer 2020-02-03 09:59:52 -08:00
Haiyan Meng
d5c66cb3d4 Add KustomizationDocument.Copy method 2020-02-03 09:59:52 -08:00
Haiyan Meng
b35b5aa73d Check the checksums of documents in the index 2020-02-03 09:59:52 -08:00
Haiyan Meng
154208d331 Improve the efficiency of crawling github by skipping the documents
already in the index
2020-01-24 19:55:56 -08:00
Haiyan Meng
f4636f8555 Add a fileType field into the index 2020-01-17 13:15:49 -08:00
Haiyan Meng
9f80da28ae Refactor the stats code for generators and transformers 2020-01-16 09:20:24 -08:00
Haiyan Meng
cf8d53a195 Move SeenMap to the utils dir 2020-01-15 15:29:16 -08:00
Haiyan Meng
29e50ab476 Collect stats on generators and transformers 2020-01-15 12:10:08 -08:00
Haiyan Meng
3519cc56a1 Add support to get files referred in the generators and tranformers
fields
2020-01-15 12:10:08 -08:00
Haiyan Meng
2e895c147e Use log.Print* instead of fmt.Print* 2020-01-14 15:50:35 -08:00
Haiyan Meng
af131c7471 Use flags to specify crawling mode and github user/repo info 2020-01-14 15:36:12 -08:00
Haiyan Meng
7ac573ae51 Add a flag to specify the index name 2020-01-14 14:25:29 -08:00
Haiyan Meng
72eda992bd make seen a non-primitive type 2020-01-14 12:14:00 -08:00
Haiyan Meng
594a3bf0d2 Add a binary for generating the stats of the index
1) how many kinds of objects are being customized?
2) how many times is every kind of object customized?
3) how many kustomization features are being used?
4) how many times is every kustomization feature used?
2020-01-07 15:10:25 -08:00
Jeff Regan
7190ea2688 Merge pull request #2038 from haiyanmeng/log-parser
Add a binary to parse GKE log
2020-01-07 14:57:40 -08:00
Jeff Regan
6bdb4fe2a6 Update main.go 2020-01-07 14:52:20 -08:00
Haiyan Meng
950660ff63 Add a binary to parse GKE log 2020-01-07 10:31:10 -08:00
Haiyan Meng
5f8a8b545b Add "kustomization" into the kustomization filenames used by the crawler 2020-01-06 12:06:18 -08:00
Haiyan Meng
be2e03681d Remove unused param from IndexFunc 2019-12-18 15:56:44 -08:00
Haiyan Meng
127541f610 Support diffrent modes of running the crawler 2019-12-18 15:56:44 -08:00
Haiyan Meng
bef157d6b3 Fix insert/updating document logic 2019-12-18 15:56:44 -08:00
Haiyan Meng
2c2aa928cc Delete non-existing documents from the index 2019-12-18 15:56:44 -08:00
Haiyan Meng
a9244f759e Add supports for crawling a specific git user or repo 2019-12-13 11:18:33 -08:00
Haiyan Meng
50ce2a66a3 Separate the two types of crawling
1) crawling the documents in the index to update these documents;
2) crawling the whole github.
2019-12-12 13:42:07 -08:00
Haiyan Meng
bffc0d7071 Mulitple improvements of the crawler
1) Set document IDs to avoid duplicating documents;
2) Set the `creationTime` field of each document in the index;
3) set the `values`, `kinds` and `identifiers` fields for all documents;
4) Add a `Copy` method into the `Document` struct: this fixes the issue
where all the documents existing in the index point to the same Document
object;
5) Avoid using keystore redis;
6) Set imagePullPolicy to `Always` for crawler jobs.
2019-12-11 11:10:48 -08:00
Jeffrey Regan
e9ab3da164 Fix some nits in the crawler and elsewhere. 2019-12-03 10:44:44 -08:00
Haiyan Meng
84b75afae4 Make the crawler work
1) add the crawler binary and fix the crawler library
2) remove the readiness probe in the search backend
3) add config for redis keystore
4) add github_api_secret.txt file with instructions
2019-11-26 09:50:51 -08:00
Haiyan Meng
53b5e0f602 Fix dir paths in crawler backend Dockerfile 2019-11-18 13:35:16 -08:00
Haiyan Meng
9255c991f4 Replace the sigs.k8s.io/kustomize/hack/crawl/* import path with
`sigs.k8s.io/kustomize/api/internal/crawl/*`
2019-11-14 13:38:18 -08:00
Haiyan Meng
f69d2d2e69 Move hack/crawl under api/internal 2019-11-14 13:17:28 -08:00