Haiyan Meng
9f80da28ae
Refactor the stats code for generators and transformers
2020-01-16 09:20:24 -08:00
Haiyan Meng
5477bde7e5
Use an env variable for index name and fix the call to NewKustomizeIndex in backend
2020-01-15 15:29:17 -08:00
Haiyan Meng
3ead42fe27
Add --index flag to kustomize_stats config file
2020-01-15 15:29:16 -08:00
Haiyan Meng
cf8d53a195
Move SeenMap to the utils dir
2020-01-15 15:29:16 -08:00
Haiyan Meng
aaaba99389
Use Document.Path instead of its fields
2020-01-15 12:10:08 -08:00
Haiyan Meng
29e50ab476
Collect stats on generators and transformers
2020-01-15 12:10:08 -08:00
Haiyan Meng
3519cc56a1
Add support to get files referred in the generators and transformers
...
fields
2020-01-15 12:10:08 -08:00
Haiyan Meng
2e895c147e
Use log.Print* instead of fmt.Print*
2020-01-14 15:50:35 -08:00
Haiyan Meng
af131c7471
Use flags to specify crawling mode and github user/repo info
2020-01-14 15:36:12 -08:00
Haiyan Meng
7ac573ae51
Add a flag to specify the index name
2020-01-14 14:25:29 -08:00
Haiyan Meng
bb09f82f3c
Remove kustomize-index-name setting
2020-01-14 13:53:16 -08:00
Haiyan Meng
72eda992bd
make seen a non-primitive type
2020-01-14 12:14:00 -08:00
Haiyan Meng
230e0ca752
Add two methods to type RangeQueryResult: Add and String
2020-01-14 12:14:00 -08:00
Haiyan Meng
14eb524b9e
Add a command for searching for kustomize resource files
2020-01-14 12:14:00 -08:00
Haiyan Meng
81d62f90bf
Improve the efficiency of crawling github
...
Make sure a github file is crawled once
2020-01-14 12:14:00 -08:00
Kubernetes Prow Robot
1a330f89d9
Merge pull request #2080 from yujunz/git-cloner
...
Simplify git cloner logic
2020-01-13 15:23:11 -08:00
Haiyan Meng
569fafba81
Add the Document ID pointing to a kustomization root into cache to
...
avoid crawling it repeatedly
2020-01-11 15:32:25 -08:00
Yujun Zhang
ae458d0c80
Simplify git cloner logic
...
Related to #2072
2020-01-11 20:40:55 +08:00
Haiyan Meng
c801958d40
Log response status code to help debug
...
Recently, the crawler job often fails after 10+ hours with the following
error (10.0.47.27:9200 is the ElasticSearch master):
dial tcp 10.0.47.27:9200: connect: connection refused
2020-01-10 11:37:22 -08:00
Haiyan Meng
f9a4d5a14e
Track the crawling process
2020-01-10 11:10:38 -08:00
Jeff Regan
9555095de9
Merge pull request #2016 from haiyanmeng/stats
...
Add a binary for generating the stats of the index
2020-01-09 13:11:50 -08:00
Jeff Regan
a46046dac5
Merge pull request #2051 from haiyanmeng/nil
...
Two fixes of the crawler
2020-01-08 18:39:26 -08:00
Jeff Regan
6186e4edb7
Merge pull request #2017 from haiyanmeng/search
...
Add ElasticSearch query examples
2020-01-08 11:19:32 -08:00
Haiyan Meng
b154af8be4
Check the error of closing response body
2020-01-08 10:32:12 -08:00
Haiyan Meng
ccd129f7a5
Check empty http response before accessing it
2020-01-08 10:24:00 -08:00
Haiyan Meng
e2b56910f9
Add ElasticSearch query examples
2020-01-08 09:23:19 -08:00
Jeff Regan
32c280664d
Merge pull request #2025 from phanimarupaka/ConfigMapSpacesAndTabs
...
Trim trailing spaces and tabs from config map files
2020-01-07 15:53:31 -08:00
Haiyan Meng
594a3bf0d2
Add a binary for generating the stats of the index
...
1) how many kinds of objects are being customized?
2) how many times is every kind of object customized?
3) how many kustomization features are being used?
4) how many times is every kustomization feature used?
2020-01-07 15:10:25 -08:00
Jeff Regan
7190ea2688
Merge pull request #2038 from haiyanmeng/log-parser
...
Add a binary to parse GKE log
2020-01-07 14:57:40 -08:00
Jeff Regan
6bdb4fe2a6
Update main.go
2020-01-07 14:52:20 -08:00
Jeff Regan
bbceb49fc4
Merge pull request #2012 from julienp/master
...
Show namespace resource on id conflict
2020-01-07 11:41:01 -08:00
Haiyan Meng
950660ff63
Add a binary to parse GKE log
2020-01-07 10:31:10 -08:00
Kubernetes Prow Robot
f749a4a194
Merge pull request #2036 from pwittrock/fix-go-mod
...
Switch to api version 0.3.1
2020-01-07 10:08:18 -08:00
Phillip Wittrock
b1f514632a
Switch to api version 0.3.1
2020-01-07 08:54:05 -08:00
Haiyan Meng
745b58b3d0
Check whether a pointer is empty before accessing it to avoid SIGSEGV
2020-01-06 12:06:18 -08:00
Haiyan Meng
142c105500
Skip the empty resource/base item in a kustomization file and set the
...
defaultBranch if needed
2020-01-06 12:06:18 -08:00
Haiyan Meng
5f8a8b545b
Add "kustomization" into the kustomization filenames used by the crawler
2020-01-06 12:06:18 -08:00
Haiyan Meng
ee659a70e4
Fix how to construct URLs for finding all the commits related to a
...
github file
The existing logic sets the creation time of a github file to the time
when the github repository was created.
The fix sets the creation time of a github file to the time when the
file was created.
2020-01-06 12:06:18 -08:00
Phani Teja Marupaka
011804e14d
Make suggested changes
2020-01-02 13:06:14 -08:00
Phani Teja Marupaka
fa8f504ff4
Trim trailing spaces and tabs from config map files
2020-01-02 10:28:03 -08:00
Julien Poissonnier
0988f74d39
Show namespace resource on id conflict
2019-12-27 16:00:14 +01:00
Haiyan Meng
be2e03681d
Remove unused param from IndexFunc
2019-12-18 15:56:44 -08:00
Haiyan Meng
127541f610
Support different modes of running the crawler
2019-12-18 15:56:44 -08:00
Haiyan Meng
f5ff254203
Update deps
2019-12-18 15:56:44 -08:00
Haiyan Meng
a35f002139
Run goimports
2019-12-18 15:56:44 -08:00
Haiyan Meng
bef157d6b3
Fix insert/updating document logic
2019-12-18 15:56:44 -08:00
Haiyan Meng
2c2aa928cc
Delete non-existing documents from the index
2019-12-18 15:56:44 -08:00
Haiyan Meng
1eb713157c
Sort the string slice fields of a document to avoid updating the index
...
unnecessarily
2019-12-18 15:56:44 -08:00
Haiyan Meng
272b7a6fcd
Use UpdateRequest to insert/update a document
...
Currently, `IndexRequest` is used to insert/update a document, which
increases the version of the document every time IndexRequest.Do is
called.
2019-12-18 15:56:44 -08:00
Haiyan Meng
5598d35e4b
Add a summary for doCrawl
2019-12-18 15:56:44 -08:00