Commit Graph

29 Commits

Author SHA1 Message Date
Haiyan Meng
272b7a6fcd Use UpdateRequest to insert/update a document
Currently, `IndexRequest` is used to insert/update a document, which
increases the version of the document every time IndexRequest.Do is
called.
2019-12-18 15:56:44 -08:00
Haiyan Meng
5598d35e4b Add a summary for doCrawl 2019-12-18 15:56:44 -08:00
Haiyan Meng
8c89f0946c Avoid to index a document if FetchDcoument or SetCreated fails 2019-12-18 15:56:44 -08:00
Haiyan Meng
12fc8f41c7 Add support for github paths starting with "git@github.com:" 2019-12-18 15:56:44 -08:00
Haiyan Meng
e44d1298df Return errors if http Client.Do resp status code is not 2xx 2019-12-18 15:56:44 -08:00
Jeffrey Regan
90597d56c9 Update go.sums 2019-12-16 11:37:35 -08:00
Jeffrey Regan
7e205b46b8 Update serialize-javascript 2019-12-13 14:45:43 -08:00
Jeff Regan
c6c099a9d1 Merge pull request #1948 from haiyanmeng/expose-es
Add supports for crawling a specific git user or repo
2019-12-13 13:24:03 -08:00
Haiyan Meng
a9244f759e Add supports for crawling a specific git user or repo 2019-12-13 11:18:33 -08:00
jregan
b8a13b6335 Pin to kustomize API v0.3.0 2019-12-12 18:41:05 -08:00
Haiyan Meng
50ce2a66a3 Separate the two types of crawling
1) crawling the documents in the index to update these documents;
2) crawling the whole github.
2019-12-12 13:42:07 -08:00
Jeffrey Regan
61c5afdf83 Update npm deps. 2019-12-12 13:18:30 -08:00
Haiyan Meng
d9239104aa Escape spaces in the query paths of git commit requests 2019-12-12 10:03:15 -08:00
Haiyan Meng
afd24c6faf Expose ElasticSearch as a LoadBalancer-type service 2019-12-11 15:05:10 -08:00
Haiyan Meng
0d79219e46 Avoid processing the nil pointer returned by kustomizationResultAdapter
Currently, the crawler job panics whenever a nil pointer is returned by
kustomizationResultAdapter.
2019-12-11 13:54:01 -08:00
Haiyan Meng
bffc0d7071 Mulitple improvements of the crawler
1) Set document IDs to avoid duplicating documents;
2) Set the `creationTime` field of each document in the index;
3) set the `values`, `kinds` and `identifiers` fields for all documents;
4) Add a `Copy` method into the `Document` struct: this fixes the issue
where all the documents existing in the index point to the same Document
object;
5) Avoid using keystore redis;
6) Set imagePullPolicy to `Always` for crawler jobs.
2019-12-11 11:10:48 -08:00
Haiyan Meng
d25b6ff3dc Remove duplicates in kinds 2019-12-04 11:24:54 -08:00
Haiyan Meng
68a196dbe5 Add a test case to demonstrate kinds have duplicates 2019-12-04 11:24:23 -08:00
Haiyan Meng
8aaa3f56f5 Set the ElasticSearch index creation configuration
Currently, the `kustomize` index in ElasticSearch is using dynamic
mapping, which sets the types of all the fields to `text`. However,
`text` field type is good for full-text value matching, and not good for
exact-value matching.  For exact-value matching, the `keyword` filed
type should be used.
2019-12-03 15:16:32 -08:00
Jeffrey Regan
e9ab3da164 Fix some nits in the crawler and elsewhere. 2019-12-03 10:44:44 -08:00
Haiyan Meng
9bba761a14 Add config for creating an ElasticSearch Cluster 2019-11-26 19:38:17 -08:00
Haiyan Meng
31c5e89b1f Add String method to KustomizationDocument to avoid printing the
content of kustomization.yaml
2019-11-26 14:49:44 -08:00
Haiyan Meng
84b75afae4 Make the crawler work
1) add the crawler binary and fix the crawler library
2) remove the readiness probe in the search backend
3) add config for redis keystore
4) add github_api_secret.txt file with instructions
2019-11-26 09:50:51 -08:00
Haiyan Meng
53b5e0f602 Fix dir paths in crawler backend Dockerfile 2019-11-18 13:35:16 -08:00
Haiyan Meng
aa09e3f3f9 Run go mod tidy 2019-11-14 13:57:29 -08:00
Haiyan Meng
8aaac77397 Update the module path in go.mod 2019-11-14 13:56:20 -08:00
Haiyan Meng
9255c991f4 Replace the sigs.k8s.io/kustomize/hack/crawl/* import path with
`sigs.k8s.io/kustomize/api/internal/crawl/*`
2019-11-14 13:38:18 -08:00
Haiyan Meng
d08140d3f7 Remove api/internal/hack/crawl/crawler/git dir, use api/internal/git
instead.
2019-11-14 13:35:00 -08:00
Haiyan Meng
f69d2d2e69 Move hack/crawl under api/internal 2019-11-14 13:17:28 -08:00