Haiyan Meng
745b58b3d0
Check whether a pointer is empty before accessing it to avoid SIGSEGV
2020-01-06 12:06:18 -08:00
Haiyan Meng
142c105500
Skip the empty resource/base item in a kustomization file and set the
defaultBranch if needed
2020-01-06 12:06:18 -08:00
Haiyan Meng
5f8a8b545b
Add "kustomization" into the kustomization filenames used by the crawler
2020-01-06 12:06:18 -08:00
Haiyan Meng
ee659a70e4
Fix how to construct URLs for finding all the commits related to a
github file
The existing logic sets the creation time of a github file to the time
when the github repository was created.
The fix sets the creation time of a github file to the time when the
file was created.
2020-01-06 12:06:18 -08:00
Haiyan Meng
be2e03681d
Remove unused param from IndexFunc
2019-12-18 15:56:44 -08:00
Haiyan Meng
127541f610
Support different modes of running the crawler
2019-12-18 15:56:44 -08:00
Haiyan Meng
f5ff254203
Update deps
2019-12-18 15:56:44 -08:00
Haiyan Meng
a35f002139
Run goimports
2019-12-18 15:56:44 -08:00
Haiyan Meng
bef157d6b3
Fix insert/updating document logic
2019-12-18 15:56:44 -08:00
Haiyan Meng
2c2aa928cc
Delete non-existing documents from the index
2019-12-18 15:56:44 -08:00
Haiyan Meng
1eb713157c
Sort the string slice fields of a document to avoid updating the index
unnecessarily
2019-12-18 15:56:44 -08:00
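The idea behind sorting the string slice fields can be sketched as follows. This is a minimal illustration, not the crawler's actual code: the `indexedDoc` type, its fields, and the `needsUpdate` helper are hypothetical names chosen for the example. Two documents that hold the same elements in a different order compare equal once sorted, so no index update is needed.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// indexedDoc is a hypothetical stand-in for a document stored in the index.
type indexedDoc struct {
	Kinds       []string
	Identifiers []string
}

// sorted returns a sorted copy of in, leaving the caller's slice untouched.
func sorted(in []string) []string {
	out := append([]string(nil), in...)
	sort.Strings(out)
	return out
}

// needsUpdate reports whether the crawled document differs from the stored
// one once element order in the string slice fields is ignored.
func needsUpdate(stored, crawled indexedDoc) bool {
	return !reflect.DeepEqual(sorted(stored.Kinds), sorted(crawled.Kinds)) ||
		!reflect.DeepEqual(sorted(stored.Identifiers), sorted(crawled.Identifiers))
}

func main() {
	stored := indexedDoc{Kinds: []string{"Service", "Deployment"}}
	crawled := indexedDoc{Kinds: []string{"Deployment", "Service"}}
	fmt.Println(needsUpdate(stored, crawled))
}
```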
Haiyan Meng
272b7a6fcd
Use UpdateRequest to insert/update a document
Currently, `IndexRequest` is used to insert/update a document, which
increases the version of the document every time IndexRequest.Do is
called.
2019-12-18 15:56:44 -08:00
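As a rough illustration of the update-style request, the sketch below builds the JSON body that the Elasticsearch Update API (`POST /<index>/_update/<id>`) accepts for an insert-or-update. It is an assumption about the shape of the request, not the crawler's code, which goes through a client library's `UpdateRequest`; the `updateBody` helper and the sample field are hypothetical. With `doc_as_upsert` set, the document is inserted if missing and merged otherwise, and Elasticsearch's default no-op detection leaves an unchanged document alone.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// updateBody builds the request body for an insert-or-update call.
// "doc" carries the fields to merge; "doc_as_upsert" tells Elasticsearch
// to insert the same content when the document does not exist yet.
func updateBody(doc interface{}) ([]byte, error) {
	return json.Marshal(map[string]interface{}{
		"doc":           doc,
		"doc_as_upsert": true,
	})
}

func main() {
	body, err := updateBody(map[string]string{"kind": "Deployment"})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(body))
}
```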
Haiyan Meng
5598d35e4b
Add a summary for doCrawl
2019-12-18 15:56:44 -08:00
Haiyan Meng
8c89f0946c
Avoid indexing a document if FetchDocument or SetCreated fails
2019-12-18 15:56:44 -08:00
Haiyan Meng
12fc8f41c7
Add support for github paths starting with "git@github.com:"
2019-12-18 15:56:44 -08:00
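One way to handle the `git@github.com:` prefix is to rewrite it to the HTTPS form before further parsing. The sketch below shows that approach; the `normalizeGithubPath` name is hypothetical, and the crawler may handle the prefix differently.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeGithubPath rewrites an SSH-style GitHub path
// ("git@github.com:owner/repo") to its HTTPS equivalent.
// Paths without the SSH prefix are returned unchanged.
func normalizeGithubPath(p string) string {
	const sshPrefix = "git@github.com:"
	if strings.HasPrefix(p, sshPrefix) {
		return "https://github.com/" + strings.TrimPrefix(p, sshPrefix)
	}
	return p
}

func main() {
	fmt.Println(normalizeGithubPath("git@github.com:kubernetes-sigs/kustomize.git"))
}
```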
Haiyan Meng
e44d1298df
Return errors if http Client.Do resp status code is not 2xx
2019-12-18 15:56:44 -08:00
Jeffrey Regan
90597d56c9
Update go.sums
2019-12-16 11:37:35 -08:00
Jeffrey Regan
7e205b46b8
Update serialize-javascript
2019-12-13 14:45:43 -08:00
Jeff Regan
c6c099a9d1
Merge pull request #1948 from haiyanmeng/expose-es
Add support for crawling a specific git user or repo
2019-12-13 13:24:03 -08:00
Haiyan Meng
a9244f759e
Add support for crawling a specific git user or repo
2019-12-13 11:18:33 -08:00
jregan
b8a13b6335
Pin to kustomize API v0.3.0
2019-12-12 18:41:05 -08:00
Haiyan Meng
50ce2a66a3
Separate the two types of crawling
1) crawling the documents in the index to update these documents;
2) crawling the whole github.
2019-12-12 13:42:07 -08:00
Jeffrey Regan
61c5afdf83
Update npm deps.
2019-12-12 13:18:30 -08:00
Haiyan Meng
d9239104aa
Escape spaces in the query paths of git commit requests
2019-12-12 10:03:15 -08:00
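A file path containing spaces has to be escaped before it goes into the query string of a commits request. The sketch below does this with `url.Values`, assuming the GitHub "list commits" endpoint and its `path` parameter. The `commitsURL` helper is a hypothetical name, and the crawler's own escaping may differ.

```go
package main

import (
	"fmt"
	"net/url"
)

// commitsURL builds a GitHub "list commits" URL restricted to one file.
// url.Values.Encode escapes the path, so spaces and slashes are safe to pass.
func commitsURL(owner, repo, filePath string) string {
	q := url.Values{}
	q.Set("path", filePath)
	return fmt.Sprintf("https://api.github.com/repos/%s/%s/commits?%s", owner, repo, q.Encode())
}

func main() {
	fmt.Println(commitsURL("owner", "repo", "my dir/kustomization.yaml"))
}
```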
Haiyan Meng
afd24c6faf
Expose ElasticSearch as a LoadBalancer-type service
2019-12-11 15:05:10 -08:00
Haiyan Meng
0d79219e46
Avoid processing the nil pointer returned by kustomizationResultAdapter
Currently, the crawler job panics whenever a nil pointer is returned by
kustomizationResultAdapter.
2019-12-11 13:54:01 -08:00
Haiyan Meng
bffc0d7071
Multiple improvements to the crawler
1) Set document IDs to avoid duplicating documents;
2) Set the `creationTime` field of each document in the index;
3) Set the `values`, `kinds` and `identifiers` fields for all documents;
4) Add a `Copy` method into the `Document` struct: this fixes the issue
where all the documents existing in the index point to the same Document
object;
5) Avoid using keystore redis;
6) Set imagePullPolicy to `Always` for crawler jobs.
2019-12-11 11:10:48 -08:00
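The aliasing issue that item 4 refers to can be illustrated with a small sketch. The `Document` and `Copy` names come from the commit message, but the fields shown here are hypothetical and the real struct is larger. Without a copy, every entry that stores the same pointer sees later mutations. A `Copy` that also duplicates slice fields gives each entry its own data.

```go
package main

import "fmt"

// Document is a reduced, hypothetical version of the crawler's document type.
type Document struct {
	FilePath string
	Kinds    []string
}

// Copy returns a deep copy: the struct is duplicated and the slice field gets
// its own backing array, so changes to the copy do not reach the original.
func (d *Document) Copy() *Document {
	c := *d
	c.Kinds = append([]string(nil), d.Kinds...)
	return &c
}

func main() {
	shared := &Document{Kinds: []string{"Deployment"}}
	var docs []*Document
	for _, p := range []string{"a/kustomization.yaml", "b/kustomization.yaml"} {
		shared.FilePath = p
		docs = append(docs, shared.Copy()) // appending shared itself would alias
	}
	for _, d := range docs {
		fmt.Println(d.FilePath)
	}
}
```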
Haiyan Meng
d25b6ff3dc
Remove duplicates in kinds
2019-12-04 11:24:54 -08:00
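Removing duplicates while keeping first-seen order can be sketched as below. The `dedupe` helper is a hypothetical name, and the actual fix may use a different approach.

```go
package main

import "fmt"

// dedupe returns the elements of in with duplicates removed,
// preserving the order in which each value first appears.
func dedupe(in []string) []string {
	seen := make(map[string]struct{}, len(in))
	out := make([]string, 0, len(in))
	for _, s := range in {
		if _, ok := seen[s]; ok {
			continue
		}
		seen[s] = struct{}{}
		out = append(out, s)
	}
	return out
}

func main() {
	fmt.Println(dedupe([]string{"Deployment", "Service", "Deployment"}))
}
```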
Haiyan Meng
68a196dbe5
Add a test case to demonstrate kinds have duplicates
2019-12-04 11:24:23 -08:00
Haiyan Meng
8aaa3f56f5
Set the ElasticSearch index creation configuration
Currently, the `kustomize` index in ElasticSearch is using dynamic
mapping, which sets the types of all the fields to `text`. However, the
`text` field type is good for full-text matching but not for
exact-value matching. For exact-value matching, the `keyword` field
type should be used.
2019-12-03 15:16:32 -08:00
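An explicit mapping of the kind described above might look like the following sketch. It generates an index creation body that declares selected fields as `keyword`, assuming the typeless mapping format of Elasticsearch 7. The `indexMapping` helper is hypothetical, and the choice of fields is for illustration only. The actual configuration in the commit may list different fields and settings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// indexMapping builds an index creation body that maps each named field to
// the keyword type, which supports exact-value matching.
func indexMapping(keywordFields ...string) ([]byte, error) {
	props := map[string]interface{}{}
	for _, f := range keywordFields {
		props[f] = map[string]string{"type": "keyword"}
	}
	return json.Marshal(map[string]interface{}{
		"mappings": map[string]interface{}{"properties": props},
	})
}

func main() {
	body, err := indexMapping("kinds", "identifiers")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(body))
}
```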
Jeffrey Regan
e9ab3da164
Fix some nits in the crawler and elsewhere.
2019-12-03 10:44:44 -08:00
Haiyan Meng
9bba761a14
Add config for creating an ElasticSearch Cluster
2019-11-26 19:38:17 -08:00
Haiyan Meng
31c5e89b1f
Add String method to KustomizationDocument to avoid printing the
content of kustomization.yaml
2019-11-26 14:49:44 -08:00
Haiyan Meng
84b75afae4
Make the crawler work
1) add the crawler binary and fix the crawler library
2) remove the readiness probe in the search backend
3) add config for redis keystore
4) add github_api_secret.txt file with instructions
2019-11-26 09:50:51 -08:00
Haiyan Meng
53b5e0f602
Fix dir paths in crawler backend Dockerfile
2019-11-18 13:35:16 -08:00
Haiyan Meng
aa09e3f3f9
Run go mod tidy
2019-11-14 13:57:29 -08:00
Haiyan Meng
8aaac77397
Update the module path in go.mod
2019-11-14 13:56:20 -08:00
Haiyan Meng
9255c991f4
Replace the sigs.k8s.io/kustomize/hack/crawl/* import path with
`sigs.k8s.io/kustomize/api/internal/crawl/*`
2019-11-14 13:38:18 -08:00
Haiyan Meng
d08140d3f7
Remove api/internal/hack/crawl/crawler/git dir, use api/internal/git
instead.
2019-11-14 13:35:00 -08:00
Haiyan Meng
f69d2d2e69
Move hack/crawl under api/internal
2019-11-14 13:17:28 -08:00