Delete internal/crawl.

monopole
2021-04-23 19:24:15 -07:00
parent e86fd7f009
commit c6f575ce37
124 changed files with 0 additions and 24414 deletions


@@ -1,428 +0,0 @@
## What is this?
### In short
Be the GoDoc.org of k8s configuration files.
### More explicitly
Support k8s document indexing from open-source configurations in order to make
it easy for people to learn to use a new feature, explore k8s configs in a
central hub, and see some metrics about kustomize use.
We want to support three main classes of queries:
1. Structured document queries: how should I use the following fields?
- Grace periods: `spec:template:spec:terminationGracePeriodSeconds`?
- Kustomize inline patch: `patches:patch`?
2. Key-value queries: how should I configure this more specific use case of a
structured configuration?
- HorizontalPodAutoscalers: `kind=HorizontalPodAutoscaler`?
- Patches on StatefulSets: `patches:target:kind=StatefulSet`?
3. Full text search: search the comments and the document text from any
type of k8s config file.
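As a rough illustration of the three query classes, the sketch below guesses which class a single query token belongs to. The `QueryKind` type and the `classify` function are hypothetical names invented for this example and are not taken from the search server. The rule itself (an `=` means key-value, a `:` means path) is an assumption based on the examples above.

```go
package main

import (
	"fmt"
	"strings"
)

// QueryKind is a hypothetical label for the three query classes.
type QueryKind string

const (
	PathQuery     QueryKind = "path"      // e.g. patches:patch
	KeyValueQuery QueryKind = "key-value" // e.g. kind=HorizontalPodAutoscaler
	TextQuery     QueryKind = "text"      // anything else
)

// classify guesses the class of one whitespace-delimited query token.
// Assumption: "=" marks a key-value query and ":" marks a path query.
func classify(token string) QueryKind {
	switch {
	case strings.Contains(token, "="):
		return KeyValueQuery
	case strings.Contains(token, ":"):
		return PathQuery
	default:
		return TextQuery
	}
}

func main() {
	query := "patches:target:kind=StatefulSet patches:patch autoscaling"
	for _, tok := range strings.Fields(query) {
		fmt.Printf("%-34s %s\n", tok, classify(tok))
	}
}
```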
## Road map
There is a lot that can be added in order to improve the state of this
application. Some more details along with general thoughts and comments can be
found in the Roadmap.md file in this directory. This README contains only
what can be considered as mostly complete and iterable parts of this project.
## Running this project
Everything is configured using kubernetes, so it should be easy for people to
spin this up on any k8s cluster. Everything should just work (TM).
The config files live in the `config` directory.
```
config
├── base
│   └── kustomization.yaml
├── crawler
│   ├── base
│   │   ├── github_api_secret.txt
│   │   └── kustomization.yaml
│   ├── cronjob
│   │   ├── cronjob.yaml
│   │   └── kustomization.yaml
│   └── job
│       ├── job.yaml
│       └── kustomization.yaml
├── elastic
│   └── ...
├── redis
│   ├── document_keystore
│   │   ├── kustomization.yaml
│   │   ├── redis.yaml
│   │   └── service.yaml
│   └── http_cache
│       ├── kustomization.yaml
│       ├── redis.yaml
│       └── service.yaml
├── webapp
│   ├── backend
│   │   ├── deployment.yaml
│   │   ├── kustomization.yaml
│   │   └── service.yaml
│   └── frontend
│       ├── deployment.yaml
│       ├── kustomization.yaml
│       └── service.yaml
└── schema_files
    └── kustomization_index
        ├── es_index_mappings.json
        └── es_index_settings.json
```
To get everything up and running you have to:
1. Get some instance of elasticsearch working... and configure the
configmapGenerator in `config/base` to point to the right endpoint(s). The
configurations that need this value to be populated are the following:
- `config/crawler/cronjob` to run periodic crawls.
- `config/crawler/job` to run crawls on demand.
- `config/webapp/backend` to run the search server.
2. Configure the elasticsearch indices:
```
kustomize build config/schema_files/kustomization_index | kubectl apply -f -
```
This will run a `curl` command that reads JSON data from a ConfigMap and sets
up the schema. If you want to make more complex modifications to the
schema, you should refer to the elastic docs to figure out whether the mapping
can be added to the current index, or whether you will need to copy the
existing index into a different one with the appropriate mappings. Modifications
can be made by using the elasticsearch go library and writing a simple program,
or it can be made with any http command to the appropriate server endpoint from
within the cluster. Unfortunately I did not have the time to write a few helper
tools for this. Feel free to contact me if you need help with modifying
elasticsearch configs, I'm by no means an expert, but I can try to help.
3. (Optional) Run the Redis HTTP cache for the crawler:
```
kubectl apply -k config/redis/http_cache
```
This will create a deployment for the cache, and a service. The crawler should
be configured to connect to the `http_cache` if it exists, but you can always
check the logs to make sure it connects, and that the identifiers match in the
crawler configuration and for the service endpoint.
Please be aware that the cache does not have a persistent volume.
4. Configure the main redis instance:
```
kubectl apply -k config/redis/document_keystore
```
This will create a StatefulSet with a volume of 4GiB for a redis instance.
5. Get an access token from GitHub.
To be able to kindly ask GitHub for its data on k8s config files, you'll need
to create an `access_token`. From my understanding, this is the only way to do
these code search queries (without first specifying a repository).
To generate a token, go to your GitHub account's Settings > Developer
Settings > Personal access tokens. It should look like this.
![GitHub Token 1](
https://sigs.k8s.io/kustomize/internal/tools/pictures/github_token.png)
From here you want to generate a new token and have the following
configuration:
![GitHub Token 2](
https://sigs.k8s.io/kustomize/internal/tools/pictures/token_config.png)
If you have uses for any other data from this token, (org data, or something
else) you can pick and choose, but be careful since it can grant this
application access to your notifications, etc. However, any such extension
is explicitly a non-goal and would not be maintained by this project.
6. Launch the crawler:
```
kustomize build config/crawler/cronjob | kubectl apply -f -
```
This will periodically run the crawler every day according to the cron timing
rules in the cronjob.yaml file.
Instead, to get the crawler running now, you can run:
```
kustomize build config/crawler/job | kubectl apply -f -
```
which will launch a non-periodic version of the crawler. It will take a few
minutes for the crawler to split the search, but then config files should
start to get populated within 20 minutes. It may take a while to do the
first crawl, since it has to fetch rate-limited endpoints for each new file it
finds. It should get significantly faster to update in the future.
7. Launch the search backend:
```
kustomize build config/webapp/backend | kubectl apply -f -
```
8. Launch the search frontend:
```
kustomize build config/webapp/frontend | kubectl apply -f -
```
## Notes about the components
### Elasticsearch
I will add a basic working setup soon. I just did the lazy thing and used an
already packaged solution. Most clouds will provide their own Elasticsearch
environments; however, Elastic is also working on its own implementation of a
[k8s operator](https://www.elastic.co/elasticsearch-kubernetes), which might
be worth checking out. Please note that it comes with its own license
agreement.
### Redis
There are two Redis instances that are used in this application.
One of them is configured to have on disk persistence, so make sure to have
that set up in your kubernetes cluster. Also note that it is running on a
single master node (i.e. it does not automatically shard keys to multiple head
nodes as part of a highly available cluster). Since it's storing a sparse
graph, I can't imagine this being much of an issue, but it's probably worth
mentioning.
The other Redis instance is running as an HTTP (RFC 7234) cache for ETags from
GitHub (or any other document store from which we could crawl/index). This one
does not require full persistent storage on disk. The caching strategy is an
LRU cache, which is probably a good starting point. It might be worth
investigating other cache policies, but I think LRU will work well since
documents may or may not expire, and the amount of memory allocated for
keys is fairly large, so eviction of frequently used documents seems unlikely.
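To make the ETag idea concrete, here is a minimal sketch of a conditional GET in Go. It is not the crawler's actual client: the `entry` type, the `fetch` function, and the in-memory map standing in for Redis are all assumptions made for this example, and a local test server plays the role of GitHub.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// entry is what the cache remembers per URL. In the setup described above
// this state would live in Redis; a map keeps the sketch self-contained.
type entry struct{ etag, body string }

// fetch issues a conditional GET. When the server answers 304 Not Modified,
// the cached body is reused and fromCache is true.
func fetch(c *http.Client, cache map[string]entry, url string) (body string, fromCache bool, err error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return "", false, err
	}
	if e, ok := cache[url]; ok {
		req.Header.Set("If-None-Match", e.etag)
	}
	resp, err := c.Do(req)
	if err != nil {
		return "", false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode == http.StatusNotModified {
		return cache[url].body, true, nil
	}
	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", false, err
	}
	cache[url] = entry{etag: resp.Header.Get("ETag"), body: string(b)}
	return string(b), false, nil
}

// newTestServer stands in for GitHub: it serves one document with a fixed ETag.
func newTestServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		const etag = `"v1"`
		if r.Header.Get("If-None-Match") == etag {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		w.Header().Set("ETag", etag)
		fmt.Fprint(w, "resources: []\n")
	}))
}

func main() {
	srv := newTestServer()
	defer srv.Close()
	cache := map[string]entry{}
	for i := 0; i < 2; i++ {
		_, hit, err := fetch(srv.Client(), cache, srv.URL)
		fmt.Println("from cache:", hit, "error:", err)
	}
}
```

The second request carries `If-None-Match`, so the server can answer with an empty 304 response instead of resending the document.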
### Nginx + Angular
There is a Dockerfile included for generating the container image with Nginx
(using the default package) and adding all of the supporting compiled angular
files. Any modifications to the code-base should be compatible with this setup,
so all that's needed is to rebuild the container image, and possibly modify
the image tags in the k8s file.
### Supporting Go binaries
There are a few Go binaries that each have their own Dockerfile to build
containers in which to run them on k8s, namely the crawler and the search
service. Their configurations are not optimal (read: they need to be cleaned
up), but they are functional.
## Technical details
### Overall design and implementations
There are a few components that are all running together in order to get
the overall application to work smoothly. This section will provide a brief
overview of each component with the following sections going into more details.
The overall structure is outlined in the following figure:
![overview](
https://github.com/kubernetes-sigs/kustomize/blob/master/api/internal/crawl/pictures/token_config.png)
#### Crawler
The leftmost component is a crawler with an HTTP cache of GitHub queries. It
does two things. First, it looks at the list of documents in Elasticsearch and
tries to update them. In doing so, it maintains a set of newly updated files to
exclude them from other parts of the crawl.
Second, to find newly added documents, the crawler crawls any new dependencies
introduced in the document updating step and it also queries GitHub for the
most recently indexed kustomization.\* files. Each new file will be processed
for efficient text queries and put into the document index. Any new dependency
will also incur more crawl operations. Finally, a graphical
representation of the documents and their dependencies is built in Redis to be
used for graph algorithms such as PageRank and component analysis.
#### Data library
There are a few helper libraries for dealing with Elasticsearch, Redis and
documents. They are neither persistent nor centralized; they act as small
components that help to package common pieces of code. Eventually it may make
sense to merge all of them together and make a proper persistent model around
them while providing an external API for document insertion/deletion, but
that is definitely out of scope in terms of getting this to run. However,
there are limitations with the current model in terms of minimizing the
API surface for the different components of the application. For now this
problem is mostly mitigated by having the query server only connected to
a data node of the Elasticsearch cluster, but the problem of knowing what
is accessible and what isn't is left to the programmer instead of being
clearly and explicitly supported by the API.
#### Server
Uses the data library to communicate with the data store and answer queries.
Processes the user entered text queries into somewhat optimized elasticsearch
queries. Provides a few endpoints to get different metrics and to eventually
allow for registration of remote repositories.
This application is exposed through a service so that users can send queries
and receive the results.
#### Nginx + Angular
Communicates directly with the backend server to forward user queries and
their results. Presents the results on an interface. It's still pretty simple
looking but it seems usable (to me).
### Crawling GitHub
With the use of API keys, GitHub allows account owners to search for files
using their API.
The search endpoints allow for the use of metadata search
that is fairly useful/powerful. For instance they provide a `filename:` keyword
that permits us to look for `kustomization.yaml`, `kustomization.yml`, etc.
This enables the fetching of a list of kustomization documents, from which
we can get the actual content from another endpoint
(raw.githubusercontent.com).
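A minimal sketch of building such a search request in Go follows. It only constructs the request and never sends it. The `newSearchRequest` helper is a hypothetical name for this example, and sending the token in the `Authorization` header (rather than as a query parameter) is a choice made here, not something taken from the crawler. Whether GitHub accepts this exact query today is not verified by the sketch.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strconv"
)

// newSearchRequest builds (but does not send) a GitHub code search request
// for files with the given name. Results are paginated, so the page number
// is part of the request.
func newSearchRequest(token, filename string, page int) (*http.Request, error) {
	q := url.Values{}
	q.Set("q", "filename:"+filename)
	q.Set("per_page", "100") // at most 100 results per page
	q.Set("page", strconv.Itoa(page))
	u := url.URL{
		Scheme:   "https",
		Host:     "api.github.com",
		Path:     "/search/code",
		RawQuery: q.Encode(),
	}
	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	// Assumption: the personal access token is passed as a header.
	req.Header.Set("Authorization", "token "+token)
	return req, nil
}

func main() {
	req, err := newSearchRequest("YOUR_TOKEN", "kustomization.yaml", 1)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```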
However, the search API is fairly limited. There is a restriction to the number
of documents that can be retrieved from this method. One possible way to
mitigate this would be to periodically query GitHub for results, sorted by the
last indexed time. This would allow you to collect most documents from this
point forwards. The downside to this is that it may require a large number of
requests to their API since you cannot know when new files will be added.
Furthermore, there is a possibility that you would not be able to get all of
the files either, depending on the velocity of growth.
The approach that was taken to mitigate this is to use the `filesize:` keyword
and to shard the search space into contiguous buckets of appropriate size in
order to get all of the documents. This is fairly efficient, since you can find
a good enough way to shard the documents in
`lg(max file size) * number of documents / 1000` API queries. Moreover, since
queries are paginated with at most 100 results per query, this solution is
competitive with getting the optimal (non-contiguous) sharding of result sets.
Furthermore, filesize queries can be cached to minimize the total number of
queries called to the API in order to shard the search space. This is done by
querying for file size intervals that always start with 0..X and binary
searching over the `filesize:` space. This will allow you to reuse a lot of
queries when you're looking for the next range, since it is upper bounded and
lower bounded to a smaller number of queries within a range that has also been
queried. I think this is only true because file sizes are power law distributed,
so searches will typically require fewer queries as they progress from left to
right.
However, this method in no way depends on intervals of the form 0..X, as
the number of documents in the many intervals of the range search could be
added together to also make this work. This approach just seemed simpler to
implement, maintain, and debug so it was preferred.
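The sharding idea can be sketched as follows. This is a simplified illustration and not the crawler's implementation: the `prefixCounter`, `shards`, and `newFakeCounter` names are hypothetical, the 1000-result cap is taken as an assumption from the formula above, and the search API is replaced by an in-memory list of file sizes. Counts are always requested for intervals of the form 0..X and memoized, and a binary search finds the widest range that still fits under the cap.

```go
package main

import (
	"fmt"
	"sort"
)

const maxResults = 1000 // assumed cap on results retrievable per search

// prefixCounter answers "how many files have a size in 0..x" and memoizes
// every answer, so repeated probes cost no additional (simulated) API calls.
type prefixCounter struct {
	query func(x int) int // stand-in for a real search API call
	cache map[int]int
	calls int
}

func (p *prefixCounter) count(x int) int {
	if x < 0 {
		return 0
	}
	if n, ok := p.cache[x]; ok {
		return n
	}
	p.calls++
	n := p.query(x)
	p.cache[x] = n
	return n
}

type shard struct{ lo, hi int }

// shards splits [0, maxSize] into contiguous ranges holding at most
// maxResults files each. A range made of a single size may still exceed
// the cap, because it cannot be split any further.
func shards(p *prefixCounter, maxSize int) []shard {
	var out []shard
	for lo := 0; lo <= maxSize; {
		before := p.count(lo - 1)
		// Binary search for the smallest offset at which [lo, lo+i] overflows.
		i := sort.Search(maxSize-lo+1, func(i int) bool {
			return p.count(lo+i)-before > maxResults
		})
		hi := lo + i - 1
		if hi < lo {
			hi = lo
		}
		out = append(out, shard{lo, hi})
		lo = hi + 1
	}
	return out
}

// newFakeCounter replaces the API with an in-memory list of file sizes.
func newFakeCounter(sizes []int) *prefixCounter {
	return &prefixCounter{
		cache: map[int]int{},
		query: func(x int) int {
			n := 0
			for _, s := range sizes {
				if s <= x {
					n++
				}
			}
			return n
		},
	}
}

func main() {
	sizes := make([]int, 7000)
	for i := range sizes {
		sizes[i] = i % 3500 // two files of every size in 0..3499
	}
	p := newFakeCounter(sizes)
	for _, s := range shards(p, 4095) {
		fmt.Printf("size %d..%d: %d files\n", s.lo, s.hi, p.count(s.hi)-p.count(s.lo-1))
	}
	fmt.Println("simulated API calls:", p.calls)
}
```

The uniform size distribution in `main` is only test data; real file sizes are skewed, so the shard widths would vary much more.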
To get an idea of how efficient this method is: sharding the search space of
7000 documents takes only ~90 API range queries, which should take only
a few minutes, while actually fetching the documents and their relevant
metadata (creation time, etc.) will take several hours. Furthermore, this
could be made more efficient if a prior distribution is approximated.
This prior could be scaled to the number of documents that need to be fetched,
and then finding a shard that has an adequate number of requests will only
take a few queries per shard. It could probably be supported in a constant
number of size queries if the size of each shard is halved, which shouldn't
have a terrible performance impact on the retrieval. However, there were
more pressing things to implement. I might revisit this later.
### Document Indexing and Processing
In order to support simple text queries the structured documents must be
processed in some way that makes searching them easy. The current method
is to recursively traverse the map of configurations to generate each sub-path
and each key-value pair for the leaf nodes of the recursion tree.
However, note that this means that a document has to be valid yaml/json
format in order for indexing to happen. The rest of the document is treated
as mostly text and uses default text settings from Elasticsearch.
What this means is that for the following yaml document:
```yaml
resources:
- service.yaml
- deployment.yaml
configmapGenerator:
- name: app-configuration
  files:
  - config.yaml
patchesJson6902:
- target:
    version: v1
    kind: StatefulSet
    name: ss-name
  path: ss-patch.yaml
- target:
    version: v1
    kind: Deployment
    name: dep-name
  path: dep-patch.yaml
```
the flattened structure would look like:
```
{
"identifiers": [
"resources",
"configmapGenerator",
"configmapGenerator:name",
"configmapGenerator:files",
"patchesJson6902",
"patchesJson6902:target",
"patchesJson6902:target:version",
"patchesJson6902:target:kind",
"patchesJson6902:target:name",
"patchesJson6902:path",
],
"values": [
"resources=service.yaml",
"resources=deployment.yaml",
"configmapGenerator:name=app-configuration",
"configmapGenerator:files=config.yaml",
"patchesJson6902:target:version=v1",
"patchesJson6902:target:kind=StatefulSet",
"patchesJson6902:target:name=ss-name",
"patchesJson6902:path=ss-patch.yaml",
"patchesJson6902:target:kind=Deployment",
"patchesJson6902:target:name=dep-name",
"patchesJson6902:path=dep-patch.yaml",
],
...
}
```
Note that unique paths and values are deduplicated.
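The recursive traversal can be sketched in a few lines of Go. This is an illustration under assumptions, not the indexing code itself: the `flatten` and `exampleDoc` names are hypothetical, and the document is written as an already decoded Go value so that the example needs no YAML library.

```go
package main

import (
	"fmt"
	"sort"
)

// flatten walks a decoded YAML/JSON value. Every sub-path is recorded in ids
// and every leaf is recorded as "path=value" in vals. Both maps act as sets,
// which takes care of deduplication.
func flatten(node interface{}, path string, ids, vals map[string]bool) {
	switch n := node.(type) {
	case map[string]interface{}:
		for k, v := range n {
			p := k
			if path != "" {
				p = path + ":" + k
			}
			ids[p] = true
			flatten(v, p, ids, vals)
		}
	case []interface{}:
		for _, v := range n {
			flatten(v, path, ids, vals) // list indices are not part of the path
		}
	default:
		vals[fmt.Sprintf("%s=%v", path, n)] = true
	}
}

// exampleDoc is the kustomization from above as a decoded Go value.
func exampleDoc() map[string]interface{} {
	target := func(kind, name string) map[string]interface{} {
		return map[string]interface{}{"version": "v1", "kind": kind, "name": name}
	}
	return map[string]interface{}{
		"resources": []interface{}{"service.yaml", "deployment.yaml"},
		"configmapGenerator": []interface{}{
			map[string]interface{}{
				"name":  "app-configuration",
				"files": []interface{}{"config.yaml"},
			},
		},
		"patchesJson6902": []interface{}{
			map[string]interface{}{"target": target("StatefulSet", "ss-name"), "path": "ss-patch.yaml"},
			map[string]interface{}{"target": target("Deployment", "dep-name"), "path": "dep-patch.yaml"},
		},
	}
}

func sortedKeys(set map[string]bool) []string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	ids, vals := map[string]bool{}, map[string]bool{}
	flatten(exampleDoc(), "", ids, vals)
	fmt.Println("identifiers:", sortedKeys(ids))
	fmt.Println("values:", sortedKeys(vals))
}
```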
On the search side, exact queries will be prioritized, but the document paths
and key=value pairs will also be analyzed with 3-grams to have some amount of
fuzzy search. Levenshtein distance was not used instead because the search
spans multiple fields at the same time, a use case where Elasticsearch does
not support proper fuzzy searching.
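To show why 3-grams give some tolerance to typos, here is a small standalone sketch. In the application the analysis happens inside Elasticsearch; the `trigrams` and `overlap` functions below are hypothetical helpers that only imitate the idea.

```go
package main

import (
	"fmt"
	"strings"
)

// trigrams returns every run of three consecutive characters in s,
// after lowercasing. Strings shorter than three characters yield nothing.
func trigrams(s string) []string {
	r := []rune(strings.ToLower(s))
	if len(r) < 3 {
		return nil
	}
	out := make([]string, 0, len(r)-2)
	for i := 0; i+3 <= len(r); i++ {
		out = append(out, string(r[i:i+3]))
	}
	return out
}

// overlap is the fraction of the query's trigrams that also occur in the
// candidate. A small typo only disturbs the few trigrams that touch it.
func overlap(query, candidate string) float64 {
	q := trigrams(query)
	if len(q) == 0 {
		return 0
	}
	have := map[string]bool{}
	for _, g := range trigrams(candidate) {
		have[g] = true
	}
	hits := 0
	for _, g := range q {
		if have[g] {
			hits++
		}
	}
	return float64(hits) / float64(len(q))
}

func main() {
	query := "kind=StatefullSet" // note the typo
	for _, candidate := range []string{"kind=StatefulSet", "kind=Deployment"} {
		fmt.Printf("%-18s %.2f\n", candidate, overlap(query, candidate))
	}
}
```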
### Document Search
Given a text query, each token is considered separately. Each token will be fed
through a handful of analyzers on the Elasticsearch side, and will be compared
with the inverted index of each document field. Elasticsearch will then determine
the best matching documents. Text ordering is largely insignificant. This makes
sense for the structured search, but may leave room for improvement for the
text only search within the document.
Each token _must_ be matched, so each white space character acts as a
conjunction of individual queries. There are also ways of telling
Elasticsearch that some things _should_ match, but I think for now it makes
more sense to leave it as is.
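The conjunction described above maps naturally onto an Elasticsearch `bool` query with one `must` clause per token. The sketch below builds such a request body; it is an assumption about the general shape, not the server's actual query. The `buildQuery` name is hypothetical, and so is the field list (`identifiers` and `values` come from the flattened structure shown earlier, `document` is made up for the raw text).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// buildQuery turns a user query into a bool query in which every
// whitespace-separated token must match at least one of the given fields.
func buildQuery(text string, fields []string) map[string]interface{} {
	must := []interface{}{}
	for _, tok := range strings.Fields(text) {
		must = append(must, map[string]interface{}{
			"multi_match": map[string]interface{}{
				"query":  tok,
				"fields": fields,
			},
		})
	}
	return map[string]interface{}{
		"query": map[string]interface{}{
			"bool": map[string]interface{}{"must": must},
		},
	}
}

func main() {
	// Hypothetical field names.
	fields := []string{"identifiers", "values", "document"}
	body, err := json.MarshalIndent(buildQuery("kind=Deployment replicas", fields), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Moving a clause from `must` to `should` is how a token would be made optional instead of required.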
I think this behavior is sufficient to make the search feel fairly intuitive
while providing support for fairly complex use cases.
### Metrics Computation
From each kustomization document that is indexed, we can find its
resources that are publicly available. This includes other kustomizations.
From this, we can build a directed graph of dependencies and reverse
dependencies.
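A minimal in-memory version of this graph could look like the sketch below. The application keeps the graph in Redis; the `graph` type and its methods are hypothetical names for this example, and document identifiers are plain strings here.

```go
package main

import (
	"fmt"
	"sort"
)

// graph stores both directions of every dependency edge.
type graph struct {
	deps  map[string][]string // document -> resources it uses
	rdeps map[string][]string // resource -> documents that use it
}

func newGraph() *graph {
	return &graph{deps: map[string][]string{}, rdeps: map[string][]string{}}
}

// addDocument records the resources listed by one kustomization.
func (g *graph) addDocument(doc string, resources []string) {
	for _, r := range resources {
		g.deps[doc] = append(g.deps[doc], r)
		g.rdeps[r] = append(g.rdeps[r], doc)
	}
}

// mostDependedOn orders documents by their number of reverse dependencies,
// breaking ties alphabetically so the result is deterministic.
func (g *graph) mostDependedOn() []string {
	docs := make([]string, 0, len(g.rdeps))
	for d := range g.rdeps {
		docs = append(docs, d)
	}
	sort.Slice(docs, func(i, j int) bool {
		ni, nj := len(g.rdeps[docs[i]]), len(g.rdeps[docs[j]])
		if ni != nj {
			return ni > nj
		}
		return docs[i] < docs[j]
	})
	return docs
}

func main() {
	g := newGraph()
	g.addDocument("overlays/dev", []string{"base"})
	g.addDocument("overlays/prod", []string{"base", "monitoring"})
	for _, d := range g.mostDependedOn() {
		fmt.Println(d, "is used by", g.rdeps[d])
	}
}
```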
This opens up the possibility to add a plethora of graph metrics that can
give the project maintainers feedback and insight into how people are using
their tools.
Some of these are useful, such as getting an idea of how large the dependency
graphs actually grow in practice, and can be used to find _popular_
kustomizations within the corpus. This lends itself to implementing PageRank
to help bubble up popular results as good search results. I unfortunately
did not have the time to implement the algorithm, but I do plan to revisit
this sometime soon to add a few good and efficient implementations of useful
graph algorithms. See the Roadmap.md for a more
complete list of features that could be added and how I think they could be
implemented.


@@ -1,176 +0,0 @@
# Road map and comments about this work
From working on this project, here is a collection of thoughts and suggestions
for future improvements. For any questions about this, or to request help, do
not hesitate to contact @damienr74 on GitHub; my email should be listed.
I think this project has the potential to help the K8s community promote best
practices. If this becomes popular, it could become easier to find
*subjectively good* configurations. This can act as a way to guide newcomers
to k8s config features that are easy to maintain, practical, and tested in some
real world environment. However, a lot of work remains to be done if this is
to happen. Extracting and ranking semantic-level information from the open
source configuration files is definitely not trivial, and will require a lot of
thought and consideration from the experts, and of the patterns that successful
k8s projects follow. This is outside of my scope, having little to no experience
with k8s other than working on this project; however, if you have ideas I can
probably suggest approaches in order to implement them, having worked a lot on
this project.
### Improving configuration files and container configs
I did not have a lot of time to refactor the images to use configmaps for
everything. This is a good thing to improve, and should be fairly easy. Another
thing that could improve the user experience of launching this would be to make
all of the go utilities subcommands of the same binary/container image. This
would reduce the number of things that would have to be rebuilt in order to get
it running, and it would make the application (and its components) more self
contained. (This also has some disadvantages, so I'll let someone else decide.)
### Adding graph metrics
From the Redis graph representation, we are able to run a multitude of graph
algorithms (not all of which are implemented).
The simplest one would be to run Kruskal's algorithm to find connected
components, and to compute graph metrics on each component. Here are some of the
metrics that may be useful:
+ Average size and histograms of the sizes of each component.
+ Average size and histograms of the node with the highest in degree (rdeps) of
each component.
+ Average size and histograms of the number of repositories in a connected
component.
+ Any other metric that may be helpful to measure the scale of the kustomize
import graph.
Another cool thing that may be helpful, would be to output the graph
representation of deps/rdeps. This should be fairly easy to do with graphviz/dot
so if anyone really wants this, I (damienr74) should be able to do it. Feel free
to send me an email or to @ mention me in an issue.
Note: DFS could also be used to find connected components, but I think union
find is preferable, since the results can be stored and modified very
efficiently. The only challenging part would be to implement deleting of edges
and nodes from a component efficiently, but I know it is possible to support
these operations with a union find structure.
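A compact union-find sketch for the connected-components idea is shown below. It treats dependency edges as undirected and only supports insertion; edge and node deletion, mentioned above as the hard part, is not covered. The `unionFind` type and the `componentSizes` function are hypothetical names for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// unionFind tracks which component each node belongs to.
type unionFind struct {
	parent map[string]string
	size   map[string]int
}

func newUnionFind() *unionFind {
	return &unionFind{parent: map[string]string{}, size: map[string]int{}}
}

// find returns the representative of x's component, registering x on first
// use and compressing the path to the root along the way.
func (u *unionFind) find(x string) string {
	if _, ok := u.parent[x]; !ok {
		u.parent[x], u.size[x] = x, 1
	}
	root := x
	for u.parent[root] != root {
		root = u.parent[root]
	}
	for u.parent[x] != root {
		next := u.parent[x]
		u.parent[x] = root
		x = next
	}
	return root
}

// union merges the components of a and b, attaching the smaller to the larger.
func (u *unionFind) union(a, b string) {
	ra, rb := u.find(a), u.find(b)
	if ra == rb {
		return
	}
	if u.size[ra] < u.size[rb] {
		ra, rb = rb, ra
	}
	u.parent[rb] = ra
	u.size[ra] += u.size[rb]
}

// componentSizes returns the size of every connected component, largest first.
func componentSizes(edges [][2]string) []int {
	u := newUnionFind()
	for _, e := range edges {
		u.union(e[0], e[1])
	}
	var sizes []int
	for node := range u.parent {
		if u.find(node) == node {
			sizes = append(sizes, u.size[node])
		}
	}
	sort.Sort(sort.Reverse(sort.IntSlice(sizes)))
	return sizes
}

func main() {
	edges := [][2]string{{"a", "b"}, {"b", "c"}, {"d", "e"}}
	fmt.Println("component sizes:", componentSizes(edges))
}
```

The per-component sizes are the raw input for the averages and histograms listed above.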
### Implementing PageRank
The graph is set up to be able to efficiently compute PageRank since the edge
weights are real valued, and the graph representation is sparse which means that
it will fit in the memory of a single machine which will make the processing
much more efficient.
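For reference, a plain power-iteration PageRank over an in-memory adjacency list fits in a few dozen lines. This sketch is an assumption about how it could be done, not existing project code: the `pageRank` function is a hypothetical name, edges are unweighted (the weighted variant would scale each share by its edge weight), and an edge points from a document to the resource it depends on, so heavily reused resources collect rank.

```go
package main

import (
	"fmt"
	"sort"
)

// pageRank runs a fixed number of power iterations. Rank held by nodes
// without outgoing edges is spread evenly over all nodes, so the ranks
// keep summing to one.
func pageRank(edges map[string][]string, damping float64, iterations int) map[string]float64 {
	nodes := map[string]bool{}
	for from, outs := range edges {
		nodes[from] = true
		for _, to := range outs {
			nodes[to] = true
		}
	}
	n := float64(len(nodes))
	rank := make(map[string]float64, len(nodes))
	for node := range nodes {
		rank[node] = 1 / n
	}
	for it := 0; it < iterations; it++ {
		dangling := 0.0
		for node := range nodes {
			if len(edges[node]) == 0 {
				dangling += rank[node]
			}
		}
		next := make(map[string]float64, len(nodes))
		for node := range nodes {
			next[node] = (1-damping)/n + damping*dangling/n
		}
		for from, outs := range edges {
			for _, to := range outs {
				next[to] += damping * rank[from] / float64(len(outs))
			}
		}
		rank = next
	}
	return rank
}

func main() {
	edges := map[string][]string{
		"overlays/dev":  {"base"},
		"overlays/prod": {"base", "monitoring"},
	}
	rank := pageRank(edges, 0.85, 50)
	names := make([]string, 0, len(rank))
	for name := range rank {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool { return rank[names[i]] > rank[names[j]] })
	for _, name := range names {
		fmt.Printf("%-14s %.3f\n", name, rank[name])
	}
}
```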
It could also be implemented as a Redis script, but I feel like there's
something fundamentally wrong with implementing PageRank in Lua. :P
### Implement feature tracking
Each day, when the crawler finds and indexes these structured documents,
it should insert aggregate data to a separate index. This data could look like the
following:
```
{
"kind": "kustomization",
"added_identifiers": [
{
"identifier": "some:new:k8s:feature",
"addedIn": [
"docID1",
"docID100",
"docID45",
...
],
},
{
"identifier": "another:k8s:feature",
"documents": [
...
],
},
...
],
"removed_identifiers": [
{
"identifier": "some:deprecated:field",
"documents": [
...
]
}
]
}
```
This would make it fairly easy to get deep insight into:
- the speed at which things can effectively be deprecated.
- how many people are migrating to current best practices.
- how many documents get updated frequently/rarely.
- detailed cross sections of growth/regression over conjunctions of features.
- a world of possibilities.
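One way to compute such a daily record is to compare two crawl snapshots. The sketch below is an assumption about how that could work, not part of the crawler: the `change` type and the `changed` function are hypothetical, each snapshot maps a document ID to its list of identifiers, and the output uses the `documents` key for both added and removed identifiers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// change lists the documents in which one identifier appeared or disappeared.
type change struct {
	Identifier string   `json:"identifier"`
	Documents  []string `json:"documents"`
}

// changed reports, per identifier, the documents that have it in `to` but
// not in `from`. Swapping the arguments yields the removed identifiers.
func changed(from, to map[string][]string) []change {
	byID := map[string][]string{}
	for doc, ids := range to {
		old := map[string]bool{}
		for _, id := range from[doc] {
			old[id] = true
		}
		for _, id := range ids {
			if !old[id] {
				byID[id] = append(byID[id], doc)
			}
		}
	}
	out := make([]change, 0, len(byID))
	for id, docs := range byID {
		sort.Strings(docs)
		out = append(out, change{Identifier: id, Documents: docs})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Identifier < out[j].Identifier })
	return out
}

func main() {
	yesterday := map[string][]string{"docID1": {"bases"}}
	today := map[string][]string{"docID1": {"resources"}, "docID45": {"resources"}}
	record := map[string]interface{}{
		"kind":                "kustomization",
		"added_identifiers":   changed(yesterday, today),
		"removed_identifiers": changed(today, yesterday),
	}
	body, err := json.MarshalIndent(record, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```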
This is also something that I would be interested to work on sometime soon, so
feel free to contact me (damienr74) or ask questions about this.
As needed, it could be a good idea to also aggregate past data with a larger
granularity. For instance, each month, the past 30 days can be aggregated into
weekish durations, and every year these weekly aggregations can be converted
into monthly summaries, depending on how much data this ends up being and how
much you want to pay for the storage of this data.
Another cool way to compress this data would be to dynamically compress this
data into a logarithmic number of buckets with decreasing granularity. But it
seems like overkill for the amount of data that we'd likely get.
### The UI probably needs a lot of work
I'm not much of a UI/UX person and have little to no experience in developing
these types of applications. If anyone with Angular experience wants to dive in
and completely restructure the app to make the UI/UX/Code health better that
would be greatly appreciated.
### Query tuning probably still has to be adjusted
I'm also not an expert in Elasticsearch. From what I could read in the docs,
I think I've made sane decisions in converting user queries into meaningful
Elasticsearch queries, but I'm sure there are a lot of improvements that remain
to be done in order to get more accurate results.
### Some other signals that indicate the presence of a good configuration file
There are lots of heuristics that could be used to achieve this. Here are a
couple in no particular order:
+ Penalize for the number of yaml `---` document splits. I'm not sure what the
general consensus is, but I think it's better to separate them, since it
makes git commits less noisy, it's a trivial transformation, and it makes
config files smaller. However, I can understand the argument that it's somewhat
practical to keep an overall view of the configurations together (maybe).
+ Penalize the number of unique identifiers in a structured document. I think
this makes sense, since we don't want to have someone game the search engine
to match documents with every possible path from the k8s docs. PageRank might
help with this to some extent, but with a small corpus it would be fairly easy
to game.
+ Assign weights to the usefulness of certain fields. It would be good to
promote documents that use `keyRefFromConfigMap`, liveness probes, etc.
These are the main ones I can think of, but I'm sure there are a *ton* of
ways to achieve this.
If the corpus gets large enough, we might even be able to use *blockchains*,
*machine learning*, and maybe even self-driving cars.
### Add more support for indexing of other k8s/kustomize related data
One thing that jumps to mind is the use of kustomize plugins. They are easy
to track since they all have an unused global variable: `var KustomizePlugin`.
It would be easy to run the pluginator command and generate godocs for each
go file with this unique identifier.
For the sake of completeness, here is the full GitHub query that we can use to
find these:
`api.github.com/search/code?q=var+KustomizePlugin+extension%3A.go&access_token=access_token`
Godoc will not show much, since most packages will be using package main, but
using pluginator we can make it a properly named package such that Godoc would
actually generate the relevant documentation.


@@ -1,195 +0,0 @@
package server
import (
"context"
"encoding/json"
"fmt"
"log"
"net/http"
"os"
"strconv"
"strings"
"github.com/gorilla/mux"
"github.com/rs/cors"
"sigs.k8s.io/kustomize/api/internal/crawl/index"
)
type kustomizeSearch struct {
ctx context.Context
// Eventually pIndex *index.PlugginIndex
idx *index.KustomizeIndex
router *mux.Router
log *log.Logger
}
// New server. Creating a server does not launch it. To launch simply:
// srv, _ := NewKustomizeSearch(context.Background())
// err := srv.Serve(port)
// if err != nil {
// // Handle server issues.
// }
//
// The server has three endpoints, two of which are functional:
//
// /search: processes the ?q= parameter for a text query and
// returns a list of 10 results starting from the ?from= value provided,
// with the default being zero.
//
// /metrics: returns overall metrics about the files indexed. Returns
// timeseries data for kustomization files, and returns breakdown of file
// counts by their 'kind' fields
//
// /register: not implemented, but meant as an endpoint for adding new
// kustomization files to the corpus.
func NewKustomizeSearch(ctx context.Context) (*kustomizeSearch, error) {
idx, err := index.NewKustomizeIndex(ctx, "kustomize")
if err != nil {
return nil, err
}
ks := &kustomizeSearch{
ctx: ctx,
idx: idx,
router: mux.NewRouter(),
log: log.New(os.Stdout, "Kustomize server: ",
log.LstdFlags|log.Llongfile|log.LUTC),
}
return ks, nil
}
// Set up common middleware and the routes for the server.
func (ks *kustomizeSearch) routes() {
// Setup middleware.
ks.router.Use(func(handler http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
handler.ServeHTTP(w, r)
})
})
ks.router.HandleFunc("/liveness", ks.liveness()).Methods(http.MethodGet)
ks.router.HandleFunc("/readiness", ks.readiness()).Methods(http.MethodGet)
ks.router.HandleFunc("/search", ks.search()).Methods(http.MethodGet)
ks.router.HandleFunc("/metrics", ks.metrics()).Methods(http.MethodGet)
ks.router.HandleFunc("/register", ks.register()).Methods(http.MethodPost)
}
// Start listening and serving on the provided port.
func (ks *kustomizeSearch) Serve(port int) error {
ks.routes()
handler := cors.Default().Handler(ks.router)
s := &http.Server{
Addr: fmt.Sprintf(":%d", port),
Handler: handler,
// Timeouts/Limits
}
return s.ListenAndServe()
}
// /liveness endpoint
func (ks *kustomizeSearch) liveness() http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
}
}
// /readiness endpoint
func (ks *kustomizeSearch) readiness() http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
opt := index.KustomizeSearchOptions{}
_, err := ks.idx.Search("", opt)
if err != nil {
http.Error(w,
`{ "error": "could not connect to database" }`,
http.StatusInternalServerError)
return
}
w.WriteHeader(http.StatusOK)
}
}
// /register endpoint.
func (ks *kustomizeSearch) register() http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
http.Error(w, "not implemented", http.StatusNotImplemented)
}
}
// /search endpoint.
func (ks *kustomizeSearch) search() http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
values := r.URL.Query()
queries := values["q"]
ks.log.Println("Query: ", values)
var from int
fromParam := values["from"]
if len(fromParam) > 0 {
from, _ = strconv.Atoi(fromParam[0])
if from < 0 {
from = 0
}
}
_, noKinds := values["nokinds"]
opt := index.KustomizeSearchOptions{
SearchOptions: index.SearchOptions{
Size: 10,
From: from,
},
KindAggregation: !noKinds,
}
results, err := ks.idx.Search(strings.Join(queries, " "), opt)
if err != nil {
ks.log.Println("Error: ", err)
http.Error(w, `{ "error": "could not complete the query" }`,
http.StatusInternalServerError)
return
}
enc := json.NewEncoder(w)
setIndent(enc)
if err = enc.Encode(results); err != nil {
http.Error(w, `{ "error": "failed to send back results" }`,
http.StatusInternalServerError)
return
}
return
}
}
// /metrics endpoint.
func (ks *kustomizeSearch) metrics() http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
res, err := ks.idx.Search("", index.KustomizeSearchOptions{
KindAggregation: true,
TimeseriesAggregation: true,
})
if err != nil {
http.Error(w, `{ "error": "could not perform the search."}`,
http.StatusInternalServerError)
return
}
enc := json.NewEncoder(w)
setIndent(enc)
if err := enc.Encode(res); err != nil {
http.Error(w, `{ "error": "could not format return value" }`,
http.StatusInternalServerError)
return
}
}
}
// make json response human readable.
func setIndent(e *json.Encoder) {
e.SetIndent("", " ")
}


@@ -1,14 +0,0 @@
FROM golang:1.11 AS build
ARG GO111MODULE=on
WORKDIR /go/src/sigs.k8s.io/kustomize/api/internal/crawl
COPY . /go/src/sigs.k8s.io/kustomize/api/internal/crawl
RUN go mod download
RUN CGO_ENABLED=0 go install sigs.k8s.io/kustomize/api/internal/crawl/cmd/backend/
FROM scratch
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /go/bin/backend /
ENTRYPOINT ["/backend"]


@@ -1,30 +0,0 @@
package main
import (
"context"
"log"
"os"
"strconv"
server "sigs.k8s.io/kustomize/api/internal/crawl/backend"
)
func main() {
portStr := os.Getenv("PORT")
port, err := strconv.Atoi(portStr)
if portStr == "" || err != nil {
log.Fatalf("$PORT(%s) must be set to an integer\n", portStr)
}
ctx := context.Background()
ks, err := server.NewKustomizeSearch(ctx)
if err != nil {
log.Fatalf("Error creating kustomize server: %v", err)
}
err = ks.Serve(port)
if err != nil {
log.Fatalf("Error while running server: %v", err)
}
}


@@ -1,15 +0,0 @@
FROM golang:1.14 AS build
ARG GO111MODULE=on
WORKDIR /go/src/sigs.k8s.io/kustomize/api/internal/crawl
COPY . /go/src/sigs.k8s.io/kustomize/api/internal/crawl
RUN go mod download
RUN CGO_ENABLED=0 go install -v ./cmd/crawler/crawler.go
FROM scratch
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /go/bin/crawler /
ENTRYPOINT ["/crawler"]
CMD []

View File

@@ -1,206 +0,0 @@
package main
import (
"context"
"flag"
"fmt"
"log"
"net/http"
"os"
"time"
"sigs.k8s.io/kustomize/api/internal/crawl/utils"
"sigs.k8s.io/kustomize/api/internal/crawl/crawler"
"sigs.k8s.io/kustomize/api/internal/crawl/crawler/github"
"sigs.k8s.io/kustomize/api/internal/crawl/doc"
"sigs.k8s.io/kustomize/api/internal/crawl/httpclient"
"sigs.k8s.io/kustomize/api/internal/crawl/index"
"github.com/gomodule/redigo/redis"
)
const (
githubAccessTokenVar = "GITHUB_ACCESS_TOKEN"
redisCacheURL = "REDIS_CACHE_URL"
redisKeyURL = "REDIS_KEY_URL"
retryCount = 3
)
type CrawlMode int
const (
CrawlUnknown CrawlMode = iota
// Crawl all the kustomization files in all the repositories of a Github user
CrawlUser
// Crawl all the kustomization files in a Github repo
CrawlRepo
// Crawl all the documents in the index
CrawlIndex
// Crawl all the kustomization files on Github
CrawlGithub
// Crawl all the documents in the index and all the kustomization files on Github
CrawlIndexAndGithub
)
func NewCrawlMode(s string) CrawlMode {
switch s {
case "github-user":
return CrawlUser
case "github-repo":
return CrawlRepo
case "index+github":
return CrawlIndexAndGithub
case "index":
return CrawlIndex
case "github":
return CrawlGithub
default:
return CrawlUnknown
}
}
func main() {
indexNamePtr := flag.String(
"index", "kustomize", "The name of the ElasticSearch index.")
modePtr := flag.String("mode", "index+github",
`The crawling mode, which can be one of [github-user, github-repo, index, github, index+github].
* github-user: crawl all the kustomization files in all the repositories of a Github user (--github-user must be specified for this mode).
* github-repo: crawl all the kustomization files in a Github repository (--github-repo must be specified for this mode).
* index: crawl all the documents in the index.
* github: crawl all the kustomization files on Github.
* index+github: crawl all the documents in the index and all the kustomization files on Github.`)
githubUserPtr := flag.String("github-user", "",
"A github user name (e.g., kubernetes-sigs). This flag is required for the `github-user` mode.")
githubRepoPtr := flag.String("github-repo", "",
"A github repository name (e.g., kubernetes-sigs/kustomize). This flag is required for the `github-repo` mode.")
flag.Parse()
githubToken := os.Getenv(githubAccessTokenVar)
if githubToken == "" {
log.Printf("Must set the variable '%s' to make github requests.\n",
githubAccessTokenVar)
return
}
ctx := context.Background()
idx, err := index.NewKustomizeIndex(ctx, *indexNamePtr)
if err != nil {
log.Printf("Could not create an index: %v\n", err)
return
}
cacheURL := os.Getenv(redisCacheURL)
cache, err := redis.DialURL(cacheURL)
clientCache := &http.Client{}
if err != nil {
log.Printf("Error: redis could not make a connection: %v\n", err)
} else {
clientCache = httpclient.NewClient(cache)
}
// docConverter takes in a plain document and processes it for the index.
docConverter := func(d *doc.Document) (crawler.CrawledDocument, error) {
kdoc := doc.KustomizationDocument{
Document: *d,
}
err := kdoc.ParseYAML()
return &kdoc, err
}
// Index updates the value in the index.
indexFunc := func(cdoc crawler.CrawledDocument, mode index.Mode) error {
switch d := cdoc.(type) {
case *doc.KustomizationDocument:
switch mode {
case index.Delete:
log.Printf("Deleting: %v", d)
return idx.Delete(d.ID())
default:
log.Printf("Inserting: %v", d)
return idx.Put(d.ID(), d)
}
default:
return fmt.Errorf("type %T not supported", d)
}
}
// seen tracks the IDs of all the documents in the index and their corresponding file types.
// This helps avoid indexing a given document multiple times.
seen := utils.NewSeenMap()
mode := NewCrawlMode(*modePtr)
ghCrawlerConstructor := func(user, repo string) crawler.Crawler {
if user != "" {
return github.NewCrawler(githubToken, retryCount, clientCache,
github.QueryWith(
github.Filename("kustomization.yaml"),
github.Filename("kustomization.yml"),
github.Filename("kustomization"),
github.User(user)),
)
} else if repo != "" {
return github.NewCrawler(githubToken, retryCount, clientCache,
github.QueryWith(
github.Filename("kustomization.yaml"),
github.Filename("kustomization.yml"),
github.Filename("kustomization"),
github.Repo(repo)),
)
} else {
return github.NewCrawler(githubToken, retryCount, clientCache,
github.QueryWith(
github.Filename("kustomization.yaml"),
github.Filename("kustomization.yml"),
github.Filename("kustomization")),
)
}
}
query := []byte(`{ "query":{ "match_all":{} } }`)
it := idx.IterateQuery(query, 10000, 60*time.Second)
switch mode {
case CrawlIndexAndGithub:
crawlers := []crawler.Crawler{ghCrawlerConstructor("", "")}
crawler.CrawlFromSeedIterator(ctx, it, crawlers, docConverter, indexFunc, seen)
crawler.CrawlGithub(ctx, crawlers, docConverter, indexFunc, seen)
case CrawlIndex:
crawlers := []crawler.Crawler{ghCrawlerConstructor("", "")}
crawler.CrawlFromSeedIterator(ctx, it, crawlers, docConverter, indexFunc, seen)
case CrawlGithub:
crawlers := []crawler.Crawler{ghCrawlerConstructor("", "")}
// add all the documents in the index into seen.
// this greatly reduces the time overhead of CrawlGithub.
for it.Next() {
for _, hit := range it.Value().Hits.Hits {
d := hit.Document.Document
seen.Set(d.ID(), d.FileType)
}
}
if err := it.Err(); err != nil {
log.Fatalf("Error iterating the index: %v\n", err)
}
crawler.CrawlGithub(ctx, crawlers, docConverter, indexFunc, seen)
case CrawlUser:
if *githubUserPtr == "" {
flag.Usage()
log.Fatalf("Please specify a github user with the github-user flag!")
}
crawlers := []crawler.Crawler{ghCrawlerConstructor(*githubUserPtr, "")}
crawler.CrawlGithub(ctx, crawlers, docConverter, indexFunc, seen)
case CrawlRepo:
if *githubRepoPtr == "" {
flag.Usage()
log.Fatalf("Please specify a github repository with the github-repo flag!")
}
crawlers := []crawler.Crawler{ghCrawlerConstructor("", *githubRepoPtr)}
crawler.CrawlGithub(ctx, crawlers, docConverter, indexFunc, seen)
case CrawlUnknown:
flag.Usage()
log.Fatalf("The --mode flag must be one of [github-user, github-repo, index, github, index+github].")
}
}

View File

@@ -1,14 +0,0 @@
FROM golang:1.11 AS build
ARG GO111MODULE=on
WORKDIR /go/src/sigs.k8s.io/kustomize/api/internal/crawl
COPY . /go/src/sigs.k8s.io/kustomize/api/internal/crawl
RUN go mod download
RUN CGO_ENABLED=0 go install ./cmd/kustomize_stats
FROM scratch
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /go/bin/kustomize_stats /
ENTRYPOINT ["/kustomize_stats"]

View File

@@ -1,249 +0,0 @@
package main
import (
"context"
"crypto/sha256"
"flag"
"fmt"
"log"
"sort"
"time"
"sigs.k8s.io/kustomize/api/internal/crawl/doc"
"sigs.k8s.io/kustomize/api/internal/crawl/index"
)
// iterateArr adds each item in arr into countMap.
func iterateArr(arr []string, countMap map[string]int) {
for _, item := range arr {
// a missing key reads as the zero value, so no existence check is needed.
countMap[item]++
}
}
// SortMapKeyByValueInt takes a map as its input, sorts its keys according to their values
// in the map, and outputs the sorted keys as a slice.
func SortMapKeyByValueInt(m map[string]int) []string {
keys := make([]string, 0, len(m))
for key := range m {
keys = append(keys, key)
}
// sort keys according to their values in the map m
sort.Slice(keys, func(i, j int) bool { return m[keys[i]] > m[keys[j]] })
return keys
}
// SortMapKeyByValueLen takes a map as its input, sorts its keys in descending order of
// the length of their values in the map, and outputs the sorted keys as a slice.
func SortMapKeyByValueLen(m map[string][]string) []string {
keys := make([]string, 0, len(m))
for key := range m {
keys = append(keys, key)
}
// sort keys according to their values in the map m
sort.Slice(keys, func(i, j int) bool { return len(m[keys[i]]) > len(m[keys[j]]) })
return keys
}
func GeneratorOrTransformerStats(docs []*doc.KustomizationDocument) {
n := len(docs)
if n == 0 {
return
}
fileType := docs[0].FileType
fmt.Printf("There are %d %s files in total.\n", n, fileType)
GitRepositorySummary(docs, fileType)
// key of kindToUrls: a string in the KustomizationDocument.Kinds field
// value of kindToUrls: a slice of string urls defining a given kind.
kindToUrls := make(map[string][]string)
for _, d := range docs {
url := fmt.Sprintf("%s/blob/%s/%s", d.RepositoryURL, d.DefaultBranch, d.FilePath)
for _, kind := range d.Kinds {
if _, ok := kindToUrls[kind]; !ok {
kindToUrls[kind] = []string{url}
} else {
kindToUrls[kind] = append(kindToUrls[kind], url)
}
}
}
fmt.Printf("There are %d kinds of %s in total\n", len(kindToUrls), fileType)
sortedKeys := SortMapKeyByValueLen(kindToUrls)
for _, k := range sortedKeys {
sort.Strings(kindToUrls[k])
fmt.Printf("%s kind %s appears %d times\n", fileType, k, len(kindToUrls[k]))
for _, url := range kindToUrls[k] {
fmt.Printf("%s\n", url)
}
}
}
// GitRepositorySummary counts the distribution of docs:
// 1) how many git repositories are these docs from?
// 2) how many docs are from each git repository?
func GitRepositorySummary(docs []*doc.KustomizationDocument, fileType string) {
m := make(map[string]int)
for _, d := range docs {
if _, ok := m[d.RepositoryURL]; ok {
m[d.RepositoryURL]++
} else {
m[d.RepositoryURL] = 1
}
}
sortedKeys := SortMapKeyByValueInt(m)
topN := 10
i := 0
for _, k := range sortedKeys {
if i >= topN {
break
}
fmt.Printf("%d %s are from %s\n", m[k], fileType, k)
i++
}
}
func main() {
topKindsPtr := flag.Int(
"kinds", -1,
`the number of kubernetes object kinds to be listed according to their popularities.
By default, all the kinds will be listed.
If you only want to list the 10 most popular kinds, set the flag to 10.`)
topIdentifiersPtr := flag.Int(
"identifiers", -1,
`the number of identifiers to be listed according to their popularities.
By default, all the identifiers will be listed.
If you only want to list the 10 most popular identifiers, set the flag to 10.`)
topKustomizeFeaturesPtr := flag.Int(
"kustomize-features", -1,
`the number of kustomize features to be listed according to their popularities.
By default, all the features will be listed.
If you only want to list the 10 most popular features, set the flag to 10.`)
indexNamePtr := flag.String(
"index", "kustomize", "The name of the ElasticSearch index.")
flag.Parse()
ctx := context.Background()
idx, err := index.NewKustomizeIndex(ctx, *indexNamePtr)
if err != nil {
log.Fatalf("Could not create an index: %v\n", err)
}
// count tracks the number of documents in the index
count := 0
// kustomizationFilecount tracks the number of kustomization files in the index
kustomizationFilecount := 0
kindsMap := make(map[string]int)
identifiersMap := make(map[string]int)
kustomizeIdentifiersMap := make(map[string]int)
// ids tracks the unique IDs of the documents in the index
ids := make(map[string]struct{})
// generatorFiles includes all the non-kustomization files whose FileType is generator
generatorFiles := make([]*doc.KustomizationDocument, 0)
// transformersFiles includes all the non-kustomization files whose FileType is transformer
transformersFiles := make([]*doc.KustomizationDocument, 0)
checksums := make(map[string]int)
// get all the documents in the index
query := []byte(`{ "query":{ "match_all":{} } }`)
it := idx.IterateQuery(query, 10000, 60*time.Second)
for it.Next() {
for _, hit := range it.Value().Hits.Hits {
sum := fmt.Sprintf("%x", sha256.Sum256([]byte(hit.Document.DocumentData)))
if _, ok := checksums[sum]; ok {
checksums[sum]++
} else {
checksums[sum] = 1
}
// check whether there is any duplicate IDs in the index
if _, ok := ids[hit.ID]; !ok {
ids[hit.ID] = struct{}{}
} else {
log.Printf("Found duplicate ID (%s)\n", hit.ID)
}
count++
iterateArr(hit.Document.Kinds, kindsMap)
iterateArr(hit.Document.Identifiers, identifiersMap)
if doc.IsKustomizationFile(hit.Document.FilePath) {
kustomizationFilecount++
iterateArr(hit.Document.Identifiers, kustomizeIdentifiersMap)
} else {
switch hit.Document.FileType {
case "generator":
generatorFiles = append(generatorFiles, hit.Document.Copy())
case "transformer":
transformersFiles = append(transformersFiles, hit.Document.Copy())
}
}
}
}
if err := it.Err(); err != nil {
log.Fatalf("Error iterating: %v\n", err)
}
sortedKindsMapKeys := SortMapKeyByValueInt(kindsMap)
sortedIdentifiersMapKeys := SortMapKeyByValueInt(identifiersMap)
sortedKustomizeIdentifiersMapKeys := SortMapKeyByValueInt(kustomizeIdentifiersMap)
fmt.Printf(`The count of unique document IDs in the kustomize index: %d
There are %d documents in the kustomize index.
%d kinds of kubernetes objects are customized:`, len(ids), count, len(kindsMap))
fmt.Printf("\n")
kindCount := 0
for _, key := range sortedKindsMapKeys {
if *topKindsPtr < 0 || (*topKindsPtr >= 0 && kindCount < *topKindsPtr) {
fmt.Printf("\tkind `%s` is customized in %d documents\n", key, kindsMap[key])
kindCount++
}
}
fmt.Printf("%d kinds of identifiers are found:\n", len(identifiersMap))
identifierCount := 0
for _, key := range sortedIdentifiersMapKeys {
if *topIdentifiersPtr < 0 || (*topIdentifiersPtr >= 0 && identifierCount < *topIdentifiersPtr) {
fmt.Printf("\tidentifier `%s` appears in %d documents\n", key, identifiersMap[key])
identifierCount++
}
}
fmt.Printf(`There are %d kustomization files in the kustomize index.
%d kinds of kustomize features are found:`, kustomizationFilecount, len(kustomizeIdentifiersMap))
fmt.Printf("\n")
kustomizeFeatureCount := 0
for _, key := range sortedKustomizeIdentifiersMapKeys {
if *topKustomizeFeaturesPtr < 0 || (*topKustomizeFeaturesPtr >= 0 && kustomizeFeatureCount < *topKustomizeFeaturesPtr) {
fmt.Printf("\tfeature `%s` is used in %d documents\n", key, kustomizeIdentifiersMap[key])
kustomizeFeatureCount++
}
}
GeneratorOrTransformerStats(generatorFiles)
GeneratorOrTransformerStats(transformersFiles)
fmt.Printf("There are %d checksums of document contents in total\n", len(checksums))
sortedChecksums := SortMapKeyByValueInt(checksums)
// keep at most the top 20 checksums; slicing unconditionally panics when fewer exist.
if len(sortedChecksums) > 20 {
sortedChecksums = sortedChecksums[:20]
}
fmt.Printf("The top 20 checksums are:\n")
for _, key := range sortedChecksums {
fmt.Printf("checksum %s appears %d times\n", key, checksums[key])
}
}

View File

@@ -1,8 +0,0 @@
This binary takes as its input a JSON file containing GKE logs (which can be
[exported](https://cloud.google.com/logging/docs/export/configure_export_v2) into
[Cloud Storage](https://cloud.google.com/storage/docs/)),
and extracts the `textPayload` field of each log entry.
Here is an example log entry:
{"insertId":"1sxuh4jg5lw6w10","labels":{"compute.googleapis.com/resource_name":"gke-crawler2-default-pool-5e55ea05-gzgv","container.googleapis.com/namespace_name":"default","container.googleapis.com/pod_name":"kustomize-stats-5bczg","container.googleapis.com/stream":"stdout"},"logName":"projects/haiyanmeng-gke-dev/logs/kustomize-stats","receiveTimestamp":"2020-01-06T23:33:07.012831742Z","resource":{"labels":{"cluster_name":"crawler2","container_name":"kustomize-stats","instance_id":"8183086081854184383","namespace_id":"default","pod_id":"kustomize-stats-5bczg","project_id":"haiyanmeng-gke-dev","zone":"us-central1-a"},"type":"container"},"severity":"INFO","textPayload":"The kustomize index already exists\n","timestamp":"2020-01-06T23:32:46.628930547Z"}
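
As a rough illustration of the extraction step described above, the following Go sketch pulls `textPayload` out of a single log line. The helper name `extractTextPayload`, the `logEntry` type, and the shortened sample line are assumptions made for this example; they are not part of the binary itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// logEntry models only the field of interest; encoding/json ignores the rest.
// A pointer distinguishes a missing field from an empty string.
type logEntry struct {
	TextPayload *string `json:"textPayload"`
}

// extractTextPayload is a hypothetical helper. It returns the payload, whether
// the field was present, and any JSON decoding error.
func extractTextPayload(line string) (string, bool, error) {
	var e logEntry
	if err := json.Unmarshal([]byte(line), &e); err != nil {
		return "", false, err
	}
	if e.TextPayload == nil {
		return "", false, nil
	}
	return *e.TextPayload, true, nil
}

func main() {
	// Shortened, illustrative version of the log entry shown above.
	line := `{"severity":"INFO","textPayload":"The kustomize index already exists\n"}`
	payload, ok, err := extractTextPayload(line)
	if err != nil {
		fmt.Println("invalid log entry:", err)
		return
	}
	if ok {
		fmt.Print(payload)
	}
}
```

Decoding into a small struct rather than `interface{}` avoids an unchecked type assertion when a line is not a JSON object.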

View File

@@ -1,7 +0,0 @@
wget <log-file-url> -O log
go build .
./log-parser log >out
cat out | grep "kind \`" | cut -d\` -f2 | tail -n 50
cat out | grep "kind \`" | awk '{print $6}' | tail -n 50
cat out | grep "feature \`" | grep -v "\`resources\`" | grep -v -e "\`apiVersion\`" | grep -v -e "\`apiversion\`" | cut -d\` -f2
cat out | grep "feature \`" | grep -v "\`resources\`" | grep -v -e "\`apiVersion\`" | grep -v -e "\`apiversion\`" | awk '{print $6}'

View File

@@ -1,49 +0,0 @@
package main
import (
"bufio"
"encoding/json"
"fmt"
"log"
"os"
)
func main() {
if len(os.Args) != 2 {
log.Fatalf("The usage of the command is: \n\t%s <log-file.json>", os.Args[0])
}
file, err := os.Open(os.Args[1])
if err != nil {
log.Fatal(err)
}
closeFile := func(file *os.File) {
if err := file.Close(); err != nil {
log.Fatal(err)
}
}
defer closeFile(file)
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
var entry interface{}
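if err := json.Unmarshal([]byte(line), &entry); err != nil {
log.Printf("failed to unmarshal a log entry: %s\n", line)
// skip this line; entry is nil and the type assertion below would panic.
continue
}
m, ok := entry.(map[string]interface{})
if !ok {
log.Printf("the log entry is not a JSON object: %s\n", line)
continue
}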
if payload, ok := m["textPayload"]; ok {
// use fmt.Printf here instead of log.Printf to avoid the time and code location info the log package provides
fmt.Printf("%s", payload)
} else {
log.Printf("the log entry does not have the `textPayload` field: %s\n", line)
}
}
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
}

View File

@@ -1,5 +0,0 @@
configmapGenerator:
- name: elasticsearch-config
literals:
- es-url="http://esbasic-master:9200"
- plugin-index-name="plugin"

View File

@@ -1 +0,0 @@
github_api_secret.txt

View File

@@ -1,2 +0,0 @@
<ADD YOUR GITHUB PERSONAL ACCESS TOKEN HERE WITHOUT A TRAILING NEWLINE>
Run: printf "<your-token>" > github_api_secret.txt

View File

@@ -1,15 +0,0 @@
resources:
- ../../base
configmapGenerator:
- name: crawler-http-cache
literals:
- redis-cache-url="redis://redis-http-cache:6379"
- name: redis-keystore
literals:
- keystore-url="redis://redis-docs-keystore:6379"
secretGenerator:
- name: github-access-token
files:
- token=github_api_secret.txt

View File

@@ -1,34 +0,0 @@
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: crawler-cronjob
spec:
# run the cronjob at 00:00 on days 1, 8, 15, 22 and 29 of each month (roughly every 7 days)
schedule: "0 0 */7 * *"
jobTemplate:
spec:
template:
spec:
restartPolicy: OnFailure
containers:
- name: crawler
image: gcr.io/haiyanmeng-gke-dev/crawler:v1
command: ["/crawler"]
args: ["--mode=index+github", "--github-repo=kubernetes-sigs/kustomize", "--index=kustomize"]
imagePullPolicy: Always
env:
- name: GITHUB_ACCESS_TOKEN
valueFrom:
secretKeyRef:
name: github-access-token
key: token
- name: ELASTICSEARCH_URL
valueFrom:
configMapKeyRef:
name: elasticsearch-config
key: es-url
- name: REDIS_CACHE_URL
valueFrom:
configMapKeyRef:
name: crawler-http-cache
key: redis-cache-url

View File

@@ -1,3 +0,0 @@
resources:
- ../base
- cronjob.yaml

View File

@@ -1,53 +0,0 @@
The crawler job can run in one of the following modes:
# Crawling all the documents in the index and crawling all the kustomization files on Github
This is the default setting of the crawler job. The `command` and `args` fields
of the container should be:
```
command: ["/crawler"]
```
Or
```
command: ["/crawler"]
args: ["--mode=index+github"]
```
# Crawling all the documents in the index
The `command` and `args` fields of the container should be:
```
command: ["/crawler"]
args: ["--mode=index"]
```
# Crawling all the kustomization files on Github
The `command` and `args` fields of the container should be:
```
command: ["/crawler"]
args: ["--mode=github"]
```
# Crawling all the kustomization files in a Github repo
The `command` and `args` fields of the container should be like:
```
command: ["/crawler"]
args: ["--mode=github-repo", "--github-repo=kubernetes-sigs/kustomize"]
```
# Crawling all the kustomization files in all the repositories of a Github user
The `command` and `args` fields of the container should be like:
```
command: ["/crawler"]
args: ["--mode=github-user", "--github-user=kubernetes-sigs"]
```
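
The mode and flag combinations above could be checked along the following lines. This is a minimal sketch, not the crawler's actual code: `validateArgs` is a hypothetical helper and the error messages are made up for illustration. Only the mode names, the flag names, and the `index+github` default come from the crawler.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// validateArgs is a hypothetical helper: it parses crawler-style arguments and
// checks that each mode comes with the flag it requires.
func validateArgs(args []string) (string, error) {
	fs := flag.NewFlagSet("crawler", flag.ContinueOnError)
	mode := fs.String("mode", "index+github", "crawling mode")
	user := fs.String("github-user", "", "github user name")
	repo := fs.String("github-repo", "", "github repository name")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	switch *mode {
	case "index", "github", "index+github":
		return *mode, nil
	case "github-user":
		if *user == "" {
			return "", errors.New("--github-user must be specified for the github-user mode")
		}
		return *mode, nil
	case "github-repo":
		if *repo == "" {
			return "", errors.New("--github-repo must be specified for the github-repo mode")
		}
		return *mode, nil
	default:
		return "", fmt.Errorf("unknown mode %q", *mode)
	}
}

func main() {
	examples := [][]string{
		{},
		{"--mode=github-repo", "--github-repo=kubernetes-sigs/kustomize"},
		{"--mode=github-user"},
	}
	for _, args := range examples {
		mode, err := validateArgs(args)
		fmt.Printf("args=%v mode=%q err=%v\n", args, mode, err)
	}
}
```

Using a dedicated `flag.FlagSet` keeps the check testable without touching the process-wide command line.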

View File

@@ -1,35 +0,0 @@
apiVersion: batch/v1
kind: Job
metadata:
name: crawler
spec:
template:
spec:
restartPolicy: OnFailure
containers:
- name: crawler
image: gcr.io/haiyanmeng-gke-dev/crawler:v1
imagePullPolicy: Always
command: ["/crawler"]
args: ["--mode=github-repo", "--github-repo=kubernetes-sigs/kustomize", "--index=kustomize"]
env:
- name: GITHUB_ACCESS_TOKEN
valueFrom:
secretKeyRef:
name: github-access-token
key: token
- name: ELASTICSEARCH_URL
valueFrom:
configMapKeyRef:
name: elasticsearch-config
key: es-url
- name: REDIS_CACHE_URL
valueFrom:
configMapKeyRef:
name: crawler-http-cache
key: redis-cache-url
- name: REDIS_KEY_URL
valueFrom:
configMapKeyRef:
name: redis-keystore
key: keystore-url

View File

@@ -1,3 +0,0 @@
resources:
- ../base
- job.yaml

View File

@@ -1,20 +0,0 @@
apiVersion: batch/v1
kind: Job
metadata:
name: kustomize-stats
spec:
template:
spec:
restartPolicy: OnFailure
containers:
- name: kustomize-stats
image: gcr.io/haiyanmeng-gke-dev/kustomize_stats:v1
imagePullPolicy: Always
command: ["/kustomize_stats"]
args: ["--index=kustomize", "--kinds=51", "--identifiers=50", "--kustomize-features=50"]
env:
- name: ELASTICSEARCH_URL
valueFrom:
configMapKeyRef:
name: elasticsearch-config
key: es-url

View File

@@ -1,3 +0,0 @@
resources:
- ../base
- job.yaml

View File

@@ -1,23 +0,0 @@
# ESBackup depends on ESCluster, and ESSnapshot depends on ESBackup.
# Creating `esbackup/kustomize-backup` will create the `kustomize-backup` snapshot repository.
# Deleting `esbackup/kustomize-backup` will delete the `kustomize-backup` snapshot repository.
# Deleting `esbackup/kustomize-backup` will NOT delete the snapshots in the `kustomize-backup` snapshot repository, but it makes all the snapshots in the repository inaccessible.
# Deleting `esbackup/kustomize-backup` will NOT delete the essnapshot objects depending on it, but will cause those essnapshot objects to be reconciled, which updates their status to reflect the fact that the esbackup object is missing.
# If you delete the `kustomize-backup` snapshot repository directly without deleting `esbackup/kustomize-backup`, the ESBackup object will not recreate the snapshot repository.
apiVersion: elasticsearch.cloud.google.com/v1alpha1
kind: ESBackup
metadata:
name: kustomize-backup
spec:
storage:
gcs:
# the bucket must exist for the snapshot repository to be created successfully.
bucket: kustomize-backup
# the path does not need to exist.
# If the path does not exist, the controller will create the folder in the GCS bucket.
# If the path already exists and includes snapshots, these snapshots can be used.
path: kustomize
secret:
name: kustomizesa
escluster:
name: esbasic

View File

@@ -1,51 +0,0 @@
# ESBackup and ESRestore depend on ESCluster.
apiVersion: elasticsearch.cloud.google.com/v1alpha1
kind: ESCluster
metadata:
name: esbasic
spec:
plugin:
pluginList:
- repository-gcs
- ingest-user-agent
- ingest-geoip
# To set `gcpserviceaccount`,
# First, create a GCP service account and download its key into a json file named `sakey.json`, following the instructions:
# https://www.elastic.co/guide/en/elasticsearch/plugins/6.5/repository-gcs-usage.html#repository-gcs-using-service-account
# Second, create a secret for the service account using the following command:
# $ kubectl create secret generic kustomizesa --from-file=./sakey.json
gcpserviceaccount:
name: kustomizesa
config:
env:
example: test
nodegroups:
- name: di
replicas: 2
data: true
ingest: true
config:
jvm:
- Djava.net.preferIPv4Stack=true
- Xms2g
- Xmx2g
es:
path.repo: '["/tmp/es_backup_basic"]'
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- topologyKey: kubernetes.io/hostname
labelSelector:
matchLabels:
es/nodegroup: di
resources:
requests:
memory: 3Gi
limits:
memory: 3Gi
- name: m
replicas: 2
master: true
config:
es:
path.repo: '["/tmp/es_backup_basic"]'

View File

@@ -1,19 +0,0 @@
# ESRestore depends on both ESCluster and ESSnapshot.
# Creating `esrestore/kustomize-restore` will restore the `kustomize` index in the `kustomize-snapshot` snapshot to a new index named `kustomize-restore`.
# Deleting `esrestore/kustomize-restore` will not delete the restored index.
# Deleting `esrestore/kustomize-restore` should happen before deleting `essnapshot/kustomize-snapshot`.
# After the restore is complete, if the `kustomize-restore` index is deleted manually, the ESRestore object will NOT restore the `kustomize` index to it again.
# The correct way of using ESRestore is: create an ESRestore object to restore the index; delete the ESRestore object after the restore is complete.
apiVersion: elasticsearch.cloud.google.com/v1alpha1
kind: ESRestore
metadata:
name: kustomize-restore
spec:
include_global_state: true
ignore_unavailable: true
rename_pattern: kustomize
rename_replacement: kustomize-restore
essnapshot:
name: kustomize-snapshot
escluster:
name: esbasic

View File

@@ -1,23 +0,0 @@
# ESSnapshot depends on ESBackup, and ESRestore depends on ESSnapshot.
# Creating `essnapshot/kustomize-snapshot` will create a snapshot named `kustomize-snapshot` in the `kustomize-backup` snapshot repository.
# After being created, the `kustomize-snapshot` snapshot will not be automatically updated when the `kustomize` index is updated.
# If you delete `essnapshot/kustomize-snapshot` and recreate it, the new snapshot will capture the current status of the `kustomize` index.
# Deleting `essnapshot/kustomize-snapshot` will delete the snapshot.
# Deleting `essnapshot/kustomize-snapshot` should happen before deleting `esbackup/kustomize-backup`.
# If the `kustomize-snapshot` snapshot is deleted directly without deleting `essnapshot/kustomize-snapshot`, the ESSnapshot object will recreate the snapshot.
# The correct way of using ESSnapshot is: create an ESSnapshot object to create a snapshot, keep the ESSnapshot object until the snapshot is no longer needed.
# To update the snapshot to capture the latest version of the index, you can either:
# 1) delete the snapshot, and wait for the ESSnapshot object to recreate the snapshot;
# 2) delete the ESSnapshot object, and recreate the ESSnapshot object.
apiVersion: elasticsearch.cloud.google.com/v1alpha1
kind: ESSnapshot
metadata:
name: kustomize-snapshot
spec:
# indices are optional. If not specified all indices are selected.
indices:
- kustomize
include_global_state: true
ignore_unavailable: true
esbackup:
name: kustomize-backup

View File

@@ -1,7 +0,0 @@
resources:
- redis.yaml
- service.yaml
commonLabels:
app: redis
tier: document-keystore

View File

@@ -1,37 +0,0 @@
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: redis-docs-keystore
spec:
serviceName: "redis-docs-keystore"
template:
spec:
containers:
- name: redis
image: redis:5-alpine
imagePullPolicy: Always
args:
- "--save"
- "900"
- "1"
- "--save"
- "30"
- "100"
- "--appendonly"
- "yes"
ports:
- name: redis-docs-port
containerPort: 6379
volumeMounts:
- mountPath: /data
name: redis-docs-keystore-data
restartPolicy: Always
volumeClaimTemplates:
- metadata:
name: redis-docs-keystore-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 4Gi

View File

@@ -1,10 +0,0 @@
apiVersion: v1
kind: Service
metadata:
name: redis-docs-keystore
spec:
clusterIP: None
ports:
- protocol: "TCP"
port: 6379
targetPort: redis-docs-port

View File

@@ -1,7 +0,0 @@
resources:
- redis.yaml
- service.yaml
commonLabels:
app: redis
tier: http-cache

View File

@@ -1,16 +0,0 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: redis-http-cache
spec:
template:
spec:
containers:
- name: redis
image: redis:5-alpine
imagePullPolicy: Always
# see redis.io/topics/lru-cache for other policy options.
args: ["--maxmemory", "1gb", "--maxmemory-policy", "allkeys-lru"]
ports:
- name: http-cache-port
containerPort: 6379

View File

@@ -1,10 +0,0 @@
apiVersion: v1
kind: Service
metadata:
name: redis-http-cache
spec:
clusterIP: None
ports:
- protocol: "TCP"
port: 6379
targetPort: http-cache-port

View File

@@ -1,35 +0,0 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: kustomize-search
spec:
selector:
matchLabels:
app: kustomize-search
tier: backend
replicas: 1
template:
metadata:
labels:
app: kustomize-search
tier: backend
spec:
containers:
- name: kustomize-search
image: gcr.io/kustomize-search/backend:latest
imagePullPolicy: Always
livenessProbe:
httpGet:
path: /liveness
port: backend-port
ports:
- name: backend-port
containerPort: 8080
env:
- name: ELASTICSEARCH_URL
valueFrom:
configMapKeyRef:
name: elasticsearch-config
key: es-url
- name: PORT
value: "8080"

View File

@@ -1,4 +0,0 @@
resources:
- ../../base
- deployment.yaml
- service.yaml

View File

@@ -1,14 +0,0 @@
apiVersion: v1
kind: Service
metadata:
name: kustomize-search
spec:
selector:
app: kustomize-search
tier: backend
ports:
- protocol: "TCP"
port: 80
targetPort: backend-port
type: LoadBalancer
loadBalancerIP: ""

View File

@@ -1,23 +0,0 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: kustomize-search-ui
spec:
selector:
matchLabels:
app: kustomize-search
tier: frontend
replicas: 1
template:
metadata:
labels:
app: kustomize-search
tier: frontend
spec:
containers:
- name: frontend
image: gcr.io/kustomize-search/frontend:latest
imagePullPolicy: Always
ports:
- name: frontend-port
containerPort: 80

View File

@@ -1,4 +0,0 @@
resources:
- ../../base
- deployment.yaml
- service.yaml

View File

@@ -1,14 +0,0 @@
apiVersion: v1
kind: Service
metadata:
name: kustomize-search-ui
spec:
selector:
app: kustomize-search
tier: frontend
ports:
- protocol: "TCP"
port: 80
targetPort: frontend-port
type: LoadBalancer
loadBalancerIP: ""

View File

@@ -1,365 +0,0 @@
// Package crawler provides helper methods and defines an interface for launching
// source repository crawlers that retrieve files from a source and forward them
// to a channel for indexing and retrieval.
package crawler
import (
"context"
"fmt"
"log"
"os"
"sync"
"sigs.k8s.io/kustomize/api/internal/crawl/utils"
"sigs.k8s.io/kustomize/api/internal/crawl/index"
_ "github.com/gomodule/redigo/redis"
"sigs.k8s.io/kustomize/api/internal/crawl/doc"
)
var (
logger = log.New(os.Stdout, "Crawler: ", log.LstdFlags|log.LUTC|log.Llongfile)
)
// Crawler forwards documents from source repositories to index and store them
// for searching. Each crawler is responsible for querying its source of
// information, and forwarding files that have not been seen before or that need
// updating.
type Crawler interface {
// Crawl returns when it is done processing. This method does not take
// ownership of the channel. The channel is write only, and it
// designates where the crawler should forward the documents.
Crawl(ctx context.Context, output chan<- CrawledDocument, seen utils.SeenMap) error
// Get the document data given the FilePath, Repo, and Ref/Tag/Branch.
FetchDocument(context.Context, *doc.Document) error
// Write to the document what the created time is.
SetCreated(context.Context, *doc.Document) error
SetDefaultBranch(*doc.Document)
Match(*doc.Document) bool
}
type CrawledDocument interface {
ID() string
GetDocument() *doc.Document
// Get all the Documents directly referred in a Document.
// For a Document representing a non-kustomization file, an empty slice will be returned.
// For a Document representing a kustomization file:
// the `includeResources` parameter determines whether the documents referred in the `resources` field are returned or not;
// the `includeTransformers` parameter determines whether the documents referred in the `transformers` field are returned or not;
// the `includeGenerators` parameter determines whether the documents referred in the `generators` field are returned or not.
GetResources(includeResources, includeTransformers, includeGenerators bool) ([]*doc.Document, error)
WasCached() bool
}
type CrawlSeed []*doc.Document
type IndexFunc func(CrawledDocument, index.Mode) error
type Converter func(*doc.Document) (CrawledDocument, error)
func logIfErr(err error) {
if err == nil {
return
}
logger.Println("error: ", err)
}
func findMatch(d *doc.Document, crawlers []Crawler) Crawler {
for _, crawl := range crawlers {
if crawl.Match(d) {
return crawl
}
}
return nil
}
func addBranches(cdoc CrawledDocument, match Crawler, indx IndexFunc,
seen utils.SeenMap, stack *CrawlSeed) {
seen.Set(cdoc.ID(), cdoc.GetDocument().FileType)
match.SetDefaultBranch(cdoc.GetDocument())
// Insert into index
if err := indx(cdoc, index.InsertOrUpdate); err != nil {
logger.Printf("Failed to insert or update doc(%s): %v",
cdoc.GetDocument().Path(), err)
return
}
deps, err := cdoc.GetResources(true, true, true)
if err != nil {
logger.Println(err)
return
}
for _, dep := range deps {
if seen.Seen(dep.ID()) && seen.Value(dep.ID()) == dep.FileType {
continue
}
*stack = append(*stack, dep)
}
}
func doCrawl(ctx context.Context, docsPtr *CrawlSeed, crawlers []Crawler, conv Converter, indx IndexFunc,
seen utils.SeenMap, stack *CrawlSeed, refreshDoc bool, updateFileType bool) {
UpdatedDocCount := 0
seenDocCount := 0
cachedDocCount := 0
findMatchErrCount := 0
FetchDocumentErrCount := 0
SetCreatedErrCount := 0
convErrCount := 0
deleteDocCount := 0
crawledDocCount := 0
// During the execution of the for loop, more Documents may be added into (*docsPtr).
for len(*docsPtr) > 0 {
// Get the last Document in (*docsPtr), which will be crawled in this iteration.
tail := (*docsPtr)[len(*docsPtr)-1]
// Remove the last Document from (*docsPtr).
*docsPtr = (*docsPtr)[:(len(*docsPtr) - 1)]
crawledDocCount++
logger.Printf("Crawling doc %d: %s", crawledDocCount, tail.Path())
if seen.Seen(tail.ID()) {
if !updateFileType || seen.Value(tail.ID()) == tail.FileType {
logger.Printf("this doc has been seen before")
seenDocCount++
continue
}
}
if tail.WasCached() {
logger.Printf("doc(%s) is cached already", tail.Path())
cachedDocCount++
continue
}
match := findMatch(tail, crawlers)
if match == nil {
logIfErr(fmt.Errorf("%v could not match any crawler", tail))
findMatchErrCount++
continue
}
if tail.User == "" {
tail.User = doc.UserName(tail.RepositoryURL)
}
// If the Document represents a kustomization root, FetchDocument will change
// the `filePath` field of the Document by appending `kustomization.yaml`,
// `kustomization.yml` or `kustomization` to the field.
// Therefore, it is necessary to add the ID of the Document to seen before
// calling FetchDocument. Otherwise, the binary may enter an infinite loop
// if a kustomization file points to its kustomization root in its `resources` or
// `bases` field.
seen.Set(tail.ID(), tail.FileType)
if refreshDoc || tail.DefaultBranch == "" {
match.SetDefaultBranch(tail)
}
if refreshDoc || tail.DocumentData == "" {
if err := match.FetchDocument(ctx, tail); err != nil {
logger.Printf("FetchDocument failed on doc(%s): %v", tail.Path(), err)
FetchDocumentErrCount++
// delete the document from the index
cdoc := &doc.KustomizationDocument{
Document: *tail,
}
seen.Set(cdoc.ID(), tail.FileType)
if err := indx(cdoc, index.Delete); err != nil {
logger.Printf("Failed to delete doc(%s): %v", cdoc.Path(), err)
} else {
// Count only the deletions that actually succeeded.
deleteDocCount++
}
continue
}
}
if refreshDoc || tail.CreationTime == nil {
if err := match.SetCreated(ctx, tail); err != nil {
logger.Printf("SetCreated failed on doc(%s): %v", tail.Path(), err)
SetCreatedErrCount++
}
}
cdoc, err := conv(tail)
// If conv returns an error, cdoc can still be added into the index so that
// cdoc.Document can be searched.
if err != nil {
logger.Printf("conv failed on doc(%s): %v", tail.Path(), err)
convErrCount++
}
UpdatedDocCount++
addBranches(cdoc, match, indx, seen, stack)
}
logger.Printf("Summary of doCrawl:\n")
logger.Printf("\t%d documents were updated\n", UpdatedDocCount)
logger.Printf("\t%d documents were seen by the crawler already and skipped\n", seenDocCount)
logger.Printf("\t%d documents were cached already and skipped\n", cachedDocCount)
logger.Printf("\t%d documents didn't have a matching crawler and were skipped\n", findMatchErrCount)
logger.Printf("\t%d documents could not be fetched; %d of them were deleted\n",
FetchDocumentErrCount, deleteDocCount)
logger.Printf("\t%d documents could not have their creation time set but were still inserted or updated in the index\n", SetCreatedErrCount)
logger.Printf("\t%d documents could not be converted but were still inserted or updated in the index\n", convErrCount)
}
// CrawlFromSeedIterator iterates over all the documents in the index and calls CrawlFromSeed for each document.
func CrawlFromSeedIterator(ctx context.Context, it *index.KustomizeIterator, crawlers []Crawler,
conv Converter, indx IndexFunc, seen utils.SeenMap) {
docCount := 0
for it.Next() {
for _, hit := range it.Value().Hits.Hits {
docCount++
logger.Printf("updating document %d from seed\n", docCount)
singleSeed := CrawlSeed{&(hit.Document.Document)}
CrawlFromSeed(ctx, singleSeed, crawlers, conv, indx, seen)
}
}
if err := it.Err(); err != nil {
log.Fatalf("Error iterating the index: %v\n", err)
}
}
// CrawlFromSeed updates all the documents in seed, and crawls all the new
// documents referred in the seed.
func CrawlFromSeed(ctx context.Context, seed CrawlSeed, crawlers []Crawler,
conv Converter, indx IndexFunc, seen utils.SeenMap) {
// stack tracks the documents directly referred in the seed.
stack := make(CrawlSeed, 0)
// each unique document in seed will be crawled once.
doCrawl(ctx, &seed, crawlers, conv, indx, seen, &stack, true, false)
logger.Printf("crawling %d new documents referred by doc\n", len(stack))
// While crawling each document in stack, the documents directly referred in the document
// will be added into stack.
// After this statement is done, stack will become empty.
doCrawl(ctx, &stack, crawlers, conv, indx, seen, &stack, false, true)
}
// CrawlGithubRunner is a blocking function and only returns once all of the
// crawlers are finished with execution.
//
// This function uses the output channel to forward kustomization documents
// from a list of crawlers. The output is to be consumed by a database/search
// indexer for later retrieval.
//
// The return value is an array of errors in which each index represents the
// index of the crawler that emitted the error. Although the errors themselves
// can be nil, the array will always be exactly the size of the crawlers array.
//
// CrawlGithubRunner takes in a seed, which represents the documents stored in an
// index somewhere. The document data is not required to be populated. If there
// are many documents, this is preferable. The order of iteration over the seed
// is not guaranteed, but the CrawlGithub does guarantee that every element
// from the seed will be processed before any other documents from the
// crawlers.
func CrawlGithubRunner(ctx context.Context, output chan<- CrawledDocument,
crawlers []Crawler, seen utils.SeenMap) []error {
errs := make([]error, len(crawlers))
wg := sync.WaitGroup{}
for i, crawler := range crawlers {
// Crawler implementations get their own channels to prevent a
// crawler from closing the main output channel.
docs := make(chan CrawledDocument)
wg.Add(2)
// Forward all of the documents from this crawler's channel to
// the main output channel.
go func(docs <-chan CrawledDocument) {
defer wg.Done()
for d := range docs {
output <- d
}
}(docs)
// Run this crawler and capture its returned error.
go func(idx int, crawler Crawler,
docs chan<- CrawledDocument) {
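defer func() {
// Record a panic as this crawler's error before signalling
// completion, so errs[idx] is written before wg.Wait returns.
if r := recover(); r != nil {
errs[idx] = fmt.Errorf(
"%+v panicked: %v, additional error %v",
crawler, r, errs[idx],
)
}
wg.Done()
}()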
defer close(docs)
errs[idx] = crawler.Crawl(ctx, docs, seen)
}(i, crawler, docs) // Copies the index and the crawler
}
wg.Wait()
return errs
}
// CrawlGithub crawls all the kustomization files on Github.
func CrawlGithub(ctx context.Context, crawlers []Crawler, conv Converter,
indx IndexFunc, seen utils.SeenMap) {
// ch is the channel to which all the crawlers send the crawled documents.
ch := make(chan CrawledDocument, 1<<10)
wg := sync.WaitGroup{}
wg.Add(1)
go func() {
defer wg.Done()
docCount := 0
for cdoc := range ch {
docCount++
logger.Printf("Processing doc %d found on Github", docCount)
// all the docs here are kustomization files found by querying Github, and
// their `FileType` fields all should be empty.
if seen.Seen(cdoc.ID()) {
logger.Printf("the doc has been seen before")
continue
}
match := findMatch(cdoc.GetDocument(), crawlers)
if match == nil {
logIfErr(fmt.Errorf(
"%v could not match any crawler", cdoc))
continue
}
// stack tracks the documents directly referred in the document.
stack := make(CrawlSeed, 0)
addBranches(cdoc, match, indx, seen, &stack)
if len(stack) > 0 {
// Here the documents referred to in a kustomization file are crawled separately,
// to avoid accumulating all the referred documents into a single gigantic
// memory-intensive stack.
logger.Printf("crawling the %d new documents referred in doc %d",
len(stack), docCount)
doCrawl(ctx, &stack, crawlers, conv, indx, seen, &stack, false, true)
}
}
}()
logger.Println("processing the documents found from crawling github")
if errs := CrawlGithubRunner(ctx, ch, crawlers, seen); errs != nil {
for _, err := range errs {
logIfErr(err)
}
}
close(ch)
wg.Wait()
}
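
The traversal that doCrawl and addBranches implement (pop the last document, mark it as seen before expanding it, push only unseen references) can be sketched in isolation. This is a minimal illustration, not the crawler itself: the `graph` type, the `crawl` function and the document names are hypothetical, and fetching, file types and indexing are left out.

```go
package main

import "fmt"

// graph is a hypothetical stand-in for a repository: each key is a document
// ID and each value lists the IDs that document refers to.
type graph map[string][]string

// crawl visits every document reachable from seed exactly once. It pops from
// the end of the slice, marks the ID as seen before expanding it, and pushes
// only references that have not been seen yet.
func crawl(g graph, seed []string) []string {
	seen := make(map[string]bool)
	stack := append([]string(nil), seed...)
	var visited []string
	for len(stack) > 0 {
		tail := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if seen[tail] {
			continue
		}
		// Mark before expanding, so a self-reference or a cycle cannot
		// cause this ID to be processed again.
		seen[tail] = true
		visited = append(visited, tail)
		for _, dep := range g[tail] {
			if !seen[dep] {
				stack = append(stack, dep)
			}
		}
	}
	return visited
}

func main() {
	g := graph{
		"overlay": {"base", "service.yaml"},
		"base":    {"overlay", "deployment.yaml"}, // refers back to overlay
	}
	fmt.Println(crawl(g, []string{"overlay"}))
}
```

Marking an ID as seen before its references are expanded is what keeps the loop finite when a kustomization refers back to one of its ancestors.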

View File

@@ -1,343 +0,0 @@
package crawler
import (
"context"
"errors"
"fmt"
"log"
"reflect"
"sort"
"strings"
"sync"
"testing"
"time"
"sigs.k8s.io/kustomize/api/internal/crawl/utils"
"sigs.k8s.io/kustomize/api/internal/crawl/index"
"sigs.k8s.io/kustomize/api/internal/crawl/doc"
"sigs.k8s.io/kustomize/api/konfig"
)
const (
kustomizeRepo = "https://github.com/kubernetes-sigs/kustomize"
)
// Simple crawler that forwards its list of documents to a provided channel and
// returns its error to the caller.
type testCrawler struct {
matchPrefix string
err error
docs []doc.KustomizationDocument
lukp map[string]int
}
func (c testCrawler) Match(d *doc.Document) bool {
return d != nil
}
func (c testCrawler) SetDefaultBranch(d *doc.Document) {}
func (c testCrawler) FetchDocument(_ context.Context, d *doc.Document) error {
if i, ok := c.lukp[d.ID()]; ok {
d.DocumentData = c.docs[i].DocumentData
return nil
}
for _, suffix := range konfig.RecognizedKustomizationFileNames() {
savedFilePath := d.FilePath
d.FilePath += "/" + suffix
i, ok := c.lukp[d.ID()]
if !ok {
d.FilePath = savedFilePath
continue
}
d.DocumentData = c.docs[i].DocumentData
return nil
}
return fmt.Errorf("document %v does not exist for matcher: %s",
d, c.matchPrefix)
}
func (c testCrawler) SetCreated(_ context.Context, d *doc.Document) error {
d.CreationTime = &time.Time{}
return nil
}
func newCrawler(matchPrefix string, err error,
docs []doc.KustomizationDocument) testCrawler {
c := testCrawler{
matchPrefix: matchPrefix,
err: err,
docs: docs,
lukp: make(map[string]int),
}
for i, d := range docs {
c.lukp[d.ID()] = i
}
return c
}
// Crawl implements the Crawler interface for testing.
func (c testCrawler) Crawl(_ context.Context,
output chan<- CrawledDocument, _ utils.SeenMap) error {
for i, d := range c.docs {
isResource := true
for _, suffix := range konfig.RecognizedKustomizationFileNames() {
if strings.HasSuffix(d.FilePath, suffix) {
isResource = false
break
}
}
if isResource {
continue
}
output <- &c.docs[i]
}
return c.err
}
// Used to make sure that we're comparing documents in order. This is needed
// since these documents will be sent concurrently.
type sortableDocs []doc.KustomizationDocument
func (s sortableDocs) Less(i, j int) bool {
return s[i].FilePath < s[j].FilePath
}
func (s sortableDocs) Swap(i, j int) {
s[i], s[j] = s[j], s[i]
}
func (s sortableDocs) Len() int {
return len(s)
}
func TestCrawlGithubRunner(t *testing.T) {
log.Println("testing CrawlGithubRunner")
tests := []struct {
tc []Crawler
errs []error
docs sortableDocs
}{
{
tc: []Crawler{
testCrawler{
docs: []doc.KustomizationDocument{
{Document: doc.Document{
FilePath: "crawler1/doc1/kustomization.yaml",
}},
{Document: doc.Document{
FilePath: "crawler1/doc2/kustomization.yaml",
}},
{Document: doc.Document{
FilePath: "crawler1/doc3/kustomization.yaml",
}},
},
},
testCrawler{err: errors.New("crawler2")},
testCrawler{},
testCrawler{
docs: []doc.KustomizationDocument{
{Document: doc.Document{
FilePath: "crawler4/doc1/kustomization.yaml",
}},
{Document: doc.Document{
FilePath: "crawler4/doc2/kustomization.yaml",
}},
},
err: errors.New("crawler4"),
},
},
errs: []error{
nil,
errors.New("crawler2"),
nil,
errors.New("crawler4"),
},
docs: sortableDocs{
{Document: doc.Document{
FilePath: "crawler1/doc1/kustomization.yaml",
}},
{Document: doc.Document{
FilePath: "crawler1/doc2/kustomization.yaml",
}},
{Document: doc.Document{
FilePath: "crawler1/doc3/kustomization.yaml",
}},
{Document: doc.Document{
FilePath: "crawler4/doc1/kustomization.yaml",
}},
{Document: doc.Document{
FilePath: "crawler4/doc2/kustomization.yaml",
}},
},
},
}
for _, test := range tests {
output := make(chan CrawledDocument)
wg := sync.WaitGroup{}
wg.Add(1)
// Run the Crawler runner with a list of crawlers.
go func() {
defer close(output)
defer wg.Done()
seen := utils.NewSeenMap()
errs := CrawlGithubRunner(context.Background(),
output, test.tc, seen)
// Check that errors are returned as they should be.
if !reflect.DeepEqual(errs, test.errs) {
t.Errorf("Expected errs (%v) to equal (%v)",
errs, test.errs)
}
}()
// Iterate over the output channel of Crawler runner.
returned := make(sortableDocs, 0, len(test.docs))
for o := range output {
d, ok := o.(*doc.KustomizationDocument)
if !ok || d == nil {
t.Errorf("%T not expected type (%T)",
o, d)
// Skip the dereference below, which would panic on a nil d.
continue
}
returned = append(returned, *d)
}
// Check that all documents are received.
sort.Sort(returned)
if !reflect.DeepEqual(returned, test.docs) {
t.Errorf("Expected docs (%v) to equal (%v)\n",
returned, test.docs)
}
wg.Wait()
}
}
func TestCrawlFromSeed(t *testing.T) {
log.Println("testing CrawlFromSeed")
tests := []struct {
seed CrawlSeed
matcher string
corpus []doc.KustomizationDocument
}{
{
seed: CrawlSeed{
{
RepositoryURL: kustomizeRepo,
FilePath: "examples/helloWorld/kustomization.yaml",
},
{
RepositoryURL: kustomizeRepo,
FilePath: "examples/other/kustomization.yaml",
},
},
matcher: kustomizeRepo,
corpus: []doc.KustomizationDocument{
// Visited from the seed, will be ignored in the crawl.
{Document: doc.Document{
RepositoryURL: kustomizeRepo,
FilePath: "examples/helloWorld/kustomization.yaml",
DocumentData: `
resources:
- deployment.yaml
`,
}},
// Also visited from the seed as a relative resource.
{Document: doc.Document{
RepositoryURL: kustomizeRepo,
FilePath: "examples/helloWorld/deployment.yaml",
DocumentData: `
apiVersion: apps/v1
kind: Deployment
metadata:
name: hello
`,
}},
// Visited from the seed. Has a remote import.
{Document: doc.Document{
RepositoryURL: kustomizeRepo,
FilePath: "examples/other/kustomization.yaml",
DocumentData: `
resources:
- https://github.com/kubernetes-sigs/kustomize/examples/other/overlay
- service.yaml
`,
}},
// Imported as a base from the seed.
{Document: doc.Document{
RepositoryURL: kustomizeRepo,
FilePath: "examples/other/overlay/kustomization.yaml",
DocumentData: `
resources:
- https://github.com/kubernetes-sigs/kustomize/examples/seedcrawl1
- https://github.com/kubernetes-sigs/kustomize/examples/seedcrawl2
`,
}},
// Imported as a resource from the seed.
{Document: doc.Document{
RepositoryURL: kustomizeRepo,
FilePath: "examples/other/service.yaml",
}},
// Visited from crawling seed.
{Document: doc.Document{
RepositoryURL: kustomizeRepo,
FilePath: "examples/seedcrawl1/kustomization.yml",
}},
// Visited from crawling seed.
{Document: doc.Document{
RepositoryURL: kustomizeRepo,
FilePath: "examples/seedcrawl2/kustomization.yaml",
DocumentData: `
resources:
- ../base
- job.yaml
`,
}},
// Visited from crawling seed.
{Document: doc.Document{
RepositoryURL: kustomizeRepo,
FilePath: "examples/base/kustomization.yml",
}},
// Visited from crawling seed imported as resource.
{Document: doc.Document{
RepositoryURL: kustomizeRepo,
FilePath: "examples/seedcrawl2/job.yaml",
}},
},
},
}
for _, tc := range tests {
cr := newCrawler(tc.matcher, nil, tc.corpus)
visited := make(map[string]int)
CrawlFromSeed(context.Background(), tc.seed, []Crawler{cr},
func(d *doc.Document) (CrawledDocument, error) {
return &doc.KustomizationDocument{
Document: *d,
}, nil
},
func(d CrawledDocument, mode index.Mode) error {
visited[d.ID()]++
return nil
},
utils.NewSeenMap(),
)
if lv, lc := len(visited), len(tc.corpus); lv != lc {
t.Errorf("error: %d of %d documents visited.", lv, lc)
t.Errorf("\nvisited (%v)\nexpected (%v).", visited, cr.lukp)
}
for id, cnt := range visited {
if cnt != 1 {
t.Errorf("%s not visited once (%d)", id, cnt)
}
}
}
}
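
The fan-in that TestCrawlGithubRunner exercises can be sketched without the crawler types. The sketch below is an assumption-laden illustration: `producer`, `runAll`, `failing` and `collect` are hypothetical names, strings stand in for documents, and panic recovery is omitted. As in CrawlGithubRunner, each producer gets a private channel so that closing it never closes the shared output, and the error slice is indexed by producer.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// producer is a hypothetical stand-in for a crawler: it sends its items to
// out and returns an error (possibly nil) when it is done.
type producer func(out chan<- string) error

// failing returns a producer that emits nothing and fails with msg.
func failing(msg string) producer {
	return func(chan<- string) error { return errors.New(msg) }
}

// runAll runs every producer concurrently and forwards their items to
// output. errs[i] holds the error returned by producers[i].
func runAll(output chan<- string, producers []producer) []error {
	errs := make([]error, len(producers))
	var wg sync.WaitGroup
	for i, p := range producers {
		items := make(chan string)
		wg.Add(2)
		// Forwarder: drain the private channel into the shared output.
		go func(items <-chan string) {
			defer wg.Done()
			for it := range items {
				output <- it
			}
		}(items)
		// Runner: execute the producer, then close its private channel.
		go func(idx int, p producer, items chan<- string) {
			defer wg.Done()
			defer close(items)
			errs[idx] = p(items)
		}(i, p, items)
	}
	wg.Wait()
	return errs
}

// collect runs the producers and returns their items, sorted, together with
// the per-producer errors.
func collect(producers []producer) ([]string, []error) {
	output := make(chan string)
	var errs []error
	go func() {
		// The deferred close runs after errs is assigned, so the reader
		// below observes errs once its range loop ends.
		defer close(output)
		errs = runAll(output, producers)
	}()
	var got []string
	for it := range output {
		got = append(got, it)
	}
	sort.Strings(got)
	return got, errs
}

func main() {
	got, errs := collect([]producer{
		func(out chan<- string) error { out <- "b"; out <- "a"; return nil },
		failing("crawler2"),
	})
	fmt.Println(got, errs)
}
```

Sorting the collected items mirrors the test's use of sortableDocs: the arrival order of concurrently forwarded items is not deterministic.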

View File

@@ -1,768 +0,0 @@
// Package github implements the crawler.Crawler interface, getting data
// from the Github search API.
package github
import (
"context"
"encoding/json"
"fmt"
"io/ioutil"
"log"
"math"
"net/http"
"os"
"regexp"
"strconv"
"strings"
"time"
"sigs.k8s.io/kustomize/api/internal/crawl/utils"
"sigs.k8s.io/kustomize/api/internal/crawl/crawler"
"sigs.k8s.io/kustomize/api/internal/crawl/doc"
"sigs.k8s.io/kustomize/api/internal/crawl/httpclient"
"sigs.k8s.io/kustomize/api/internal/git"
"sigs.k8s.io/kustomize/api/konfig"
)
var logger = log.New(os.Stdout, "Github Crawler: ",
log.LstdFlags|log.LUTC|log.Llongfile)
// Implements crawler.Crawler.
type githubCrawler struct {
client GhClient
query Query
// branchMap maps github repositories to their default branches
branchMap map[string]string
}
type GhClient struct {
RequestConfig
retryCount uint64
client *http.Client
accessToken string
}
func NewCrawler(accessToken string, retryCount uint64, client *http.Client,
query Query) githubCrawler {
return githubCrawler{
client: GhClient{
retryCount: retryCount,
client: client,
RequestConfig: RequestConfig{
perPage: githubMaxPageSize,
},
accessToken: accessToken,
},
query: query,
branchMap: map[string]string{},
}
}
func (gc githubCrawler) SetDefaultBranch(d *doc.Document) {
url := gc.client.ReposRequest(d.RepositoryFullName())
defaultBranch, err := gc.client.GetDefaultBranch(url, d.RepositoryURL, gc.branchMap)
if err != nil {
logger.Printf(
"(error: %v) setting default_branch to master\n", err)
defaultBranch = "master"
}
d.DefaultBranch = defaultBranch
gc.branchMap[d.RepositoryURL] = d.DefaultBranch
}
func (gc githubCrawler) DefaultBranch(repo string) string {
return gc.branchMap[repo]
}
// Implements crawler.Crawler.
func (gc githubCrawler) Crawl(ctx context.Context,
output chan<- crawler.CrawledDocument, seen utils.SeenMap) error {
ranges := []RangeWithin{
{
start: uint64(0),
end: githubMaxFileSize,
},
}
errs := make(multiError, 0)
for len(ranges) > 0 {
logger.Printf("Current ranges: %v (len: %d)\n", ranges, len(ranges))
tailRange := ranges[len(ranges)-1]
ranges = ranges[:(len(ranges) - 1)]
reProcessQueryRanges, err := gc.CrawlSingleRange(ctx, output, seen, tailRange.start, tailRange.end)
if err != nil {
errs = append(errs, err)
}
ranges = append(ranges, reProcessQueryRanges...)
}
if len(errs) > 0 {
return errs
}
return nil
}
func (gc githubCrawler) CrawlSingleRange(ctx context.Context,
output chan<- crawler.CrawledDocument, seen utils.SeenMap,
lowerBound, upperBound uint64) ([]RangeWithin, error) {
log.Printf("CrawlSingleRange [%d, %d]", lowerBound, upperBound)
noETagClient := GhClient{
RequestConfig: gc.client.RequestConfig,
client: &http.Client{Timeout: gc.client.client.Timeout},
retryCount: gc.client.retryCount,
accessToken: gc.client.accessToken,
}
var reProcessQueryRanges []RangeWithin
var ranges []string
var err error
// Since Github returns a max of 1000 results per query, we can use
// multiple queries that split the search space into chunks of at most
// 1000 files to get all of the data.
for i := 0; i < 5; i++ {
ranges, err = FindRangesForRepoSearch(newCache(noETagClient, gc.query),
lowerBound, upperBound)
if err == nil {
logger.Printf("FindRangesForRepoSearch succeeded after %d retries", i)
break
} else {
time.Sleep(time.Minute)
}
}
if err != nil {
return reProcessQueryRanges, fmt.Errorf("could not split %v into ranges: %v",
gc.query, err)
}
logger.Println("ranges: ", ranges)
// Query each range for files.
errs := make(multiError, 0)
queryResult := RangeQueryResult{}
for _, query := range ranges {
reProcessQuery, rangeResult, err := processQuery(ctx, gc.client, query, output, seen, gc.branchMap)
if err != nil {
errs = append(errs, err)
}
queryResult.Add(rangeResult)
if reProcessQuery {
// if the size of a range is 0, such as [245, 245], and reProcessQuery is true,
// it means that there are more than 1000 results for the query range.
// Reprocessing the query range will not help because the GitHub Search API
// only provides up to 1,000 results for each search.
if RangeSizes(query).Size() == 0 {
logger.Printf("range of size 0 includes more than 1000 results: %s", query)
} else {
reProcessQueryRanges = append(reProcessQueryRanges, RangeSizes(query))
}
}
}
logger.Printf("Summary of Crawl: %s", queryResult.String())
if len(errs) > 0 {
return reProcessQueryRanges, errs
}
return reProcessQueryRanges, nil
}
// FetchDocument first tries to fetch the document with d.FilePath. If it fails,
// it will try to add each string in konfig.RecognizedKustomizationFileNames() to
// d.FilePath, and try to fetch the document again.
func (gc githubCrawler) FetchDocument(_ context.Context, d *doc.Document) error {
repoURL := d.RepositoryURL + "/" + d.FilePath + "?ref=" + d.DefaultBranch
repoSpec, err := git.NewRepoSpecFromUrl(repoURL)
if err != nil {
return fmt.Errorf("invalid repospec: %v", err)
}
url := "https://raw.githubusercontent.com/" + repoSpec.OrgRepo +
"/" + repoSpec.Ref + "/" + repoSpec.Path
handle := func(resp *http.Response, err error, path string) error {
if resp == nil {
return fmt.Errorf("empty http response (url: %s; path: %s), error: %v",
url, path, err)
}
defer CloseResponseBody(resp)
if err != nil {
return err
}
// A non-OK status with a nil error must not be reported as success.
if resp.StatusCode != http.StatusOK {
return fmt.Errorf("unexpected status %q (url: %s; path: %s)",
resp.Status, url, path)
}
d.IsSame = httpclient.FromCache(resp.Header)
data, err := ioutil.ReadAll(resp.Body)
if err != nil {
return err
}
d.DocumentData = string(data)
d.FilePath = d.FilePath + path
return nil
}
resp, errGetRawUserContent := gc.client.GetRawUserContent(url)
if err := handle(resp, errGetRawUserContent, ""); err == nil {
return nil
}
for _, file := range konfig.RecognizedKustomizationFileNames() {
resp, errGetRawUserContent = gc.client.GetRawUserContent(url + "/" + file)
if err = handle(resp, errGetRawUserContent, "/"+file); err == nil {
return nil
}
}
return fmt.Errorf("file not found: %s, error: %v", url, err)
}
func (gc githubCrawler) SetCreated(_ context.Context, d *doc.Document) error {
fs := GhFileSpec{
Path: d.FilePath,
Repository: GitRepository{
FullName: d.RepositoryFullName(),
},
}
creationTime, err := gc.client.GetFileCreationTime(fs)
if err != nil {
return err
}
d.CreationTime = &creationTime
return nil
}
func (gc githubCrawler) Match(d *doc.Document) bool {
url := d.RepositoryURL + "/" + d.FilePath + "?ref=" + d.DefaultBranch
repoSpec, err := git.NewRepoSpecFromUrl(url)
if err != nil {
return false
}
return strings.Contains(repoSpec.Host, "github.com")
}
type RangeQueryResult struct {
totalDocCnt uint64
seenDocCnt uint64
newDocCnt uint64
errorCnt uint64
}
func (r *RangeQueryResult) Add(other RangeQueryResult) {
r.totalDocCnt += other.totalDocCnt
r.newDocCnt += other.newDocCnt
r.seenDocCnt += other.seenDocCnt
r.errorCnt += other.errorCnt
}
func (r *RangeQueryResult) String() string {
return fmt.Sprintf("got %d files from API. "+
"%d have been seen before. %d are new and sent to the output channel."+
" %d have kustomizationResultAdapter errors.",
r.totalDocCnt, r.seenDocCnt, r.newDocCnt, r.errorCnt)
}
// processQuery follows all of the pages in a query, and updates/adds the
// documents from the crawl to the datastore/index.
func processQuery(ctx context.Context, gcl GhClient, query string,
output chan<- crawler.CrawledDocument, seen utils.SeenMap,
branchMap map[string]string) (bool, RangeQueryResult, error) {
queryPages := make(chan GhResponseInfo)
go func() {
// Forward the document metadata to the retrieval channel.
// This separation allows for concurrent requests for the code
// search, and the retrieval portions of the API.
err := gcl.ForwardPaginatedQuery(ctx, query, queryPages)
if err != nil {
// TODO(damienr74) handle this error with redis?
logger.Println(err)
}
close(queryPages)
}()
reProcessQuery := false
errs := make(multiError, 0)
result := RangeQueryResult{}
pageID := 1
for page := range queryPages {
if page.Error != nil {
errs = append(errs, page.Error)
continue
}
pageResult := RangeQueryResult{}
for _, file := range page.Parsed.Items {
k, err := kustomizationResultAdapter(gcl, file, seen, branchMap)
if err != nil {
logger.Printf("kustomizationResultAdapter failed: %v", err)
errs = append(errs, err)
pageResult.errorCnt++
}
if k != nil {
pageResult.newDocCnt++
output <- k
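} else if err == nil {
// A nil document with a nil error means the file was seen before;
// a nil document with an error is already counted in errorCnt.
pageResult.seenDocCnt++
}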
pageResult.totalDocCnt++
}
logger.Printf("processQuery [TotalCount %d - page %d]: %s",
page.Parsed.TotalCount, pageID, pageResult.String())
result.Add(pageResult)
pageID++
if page.Parsed.TotalCount > githubMaxResultsPerQuery {
reProcessQuery = true
}
}
logger.Printf("Summary of processQuery: %s", result.String())
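// Return an untyped nil when there are no errors: an empty multiError
// stored in an error interface would compare as non-nil.
if len(errs) > 0 {
return reProcessQuery, result, errs
}
return reProcessQuery, result, nil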
}
func kustomizationResultAdapter(gcl GhClient, k GhFileSpec, seen utils.SeenMap,
branchMap map[string]string) (crawler.CrawledDocument, error) {
url := gcl.ReposRequest(k.Repository.FullName)
defaultBranch, err := gcl.GetDefaultBranch(url, k.Repository.URL, branchMap)
if err != nil {
logger.Printf(
"(error: %v) setting default_branch to master\n", err)
defaultBranch = "master"
}
// document here is a kustomization file found by querying Github, whose
// `FileType` field should be empty.
document := doc.Document{
FilePath: k.Path,
DefaultBranch: defaultBranch,
RepositoryURL: k.Repository.URL,
User: doc.UserName(k.Repository.URL),
}
if seen.Seen(document.ID()) {
return nil, nil
}
data, err := gcl.GetFileData(k)
if err != nil {
return nil, err
}
d := doc.KustomizationDocument{
Document: doc.Document{
DocumentData: string(data),
FilePath: k.Path,
DefaultBranch: defaultBranch,
RepositoryURL: k.Repository.URL,
User: doc.UserName(k.Repository.URL),
},
}
creationTime, err := gcl.GetFileCreationTime(k)
if err != nil {
logger.Printf("GetFileCreationTime failed: %v", err)
return &d, err
}
d.CreationTime = &creationTime
if err := d.ParseYAML(); err != nil {
logger.Printf("ParseYAML failed: %v", err)
return &d, err
}
return &d, nil
}
// ForwardPaginatedQuery follows the links to the next pages and performs all of
// the queries for a given search query, relaying the data from each request
// back to an output channel.
func (gcl GhClient) ForwardPaginatedQuery(ctx context.Context, query string,
output chan<- GhResponseInfo) error {
logger.Println("querying: ", query)
response := gcl.parseGithubResponseWithRetry(query)
if response.Error != nil {
return response.Error
}
output <- response
for response.LastURL != "" && response.NextURL != "" {
select {
case <-ctx.Done():
return nil
default:
response = gcl.parseGithubResponseWithRetry(response.NextURL)
if response.Error != nil {
return response.Error
}
output <- response
}
}
return nil
}
// GetFileData gets the bytes from a file.
func (gcl GhClient) GetFileData(k GhFileSpec) ([]byte, error) {
url := gcl.ContentsRequest(k.Repository.FullName, k.Path)
resp, err := gcl.GetReposData(url)
if err != nil {
return nil, fmt.Errorf("%+v: could not get '%s' metadata: %v",
k, url, err)
}
data, err := ioutil.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("%+v: could not read '%s' metadata: %v",
k, url, err)
}
if err := resp.Body.Close(); err != nil {
return nil, err
}
type githubContentRawURL struct {
DownloadURL string `json:"download_url,omitempty"`
}
var rawURL githubContentRawURL
err = json.Unmarshal(data, &rawURL)
if err != nil {
return nil, fmt.Errorf(
"%+v: could not get 'download_url' from '%s' response: %v",
k, data, err)
}
resp, err = gcl.GetRawUserContent(rawURL.DownloadURL)
if err != nil {
return nil, fmt.Errorf("%+v: could not fetch file raw data '%s': %v",
k, rawURL.DownloadURL, err)
}
defer CloseResponseBody(resp)
data, err = ioutil.ReadAll(resp.Body)
return data, err
}
func CloseResponseBody(resp *http.Response) {
if err := resp.Body.Close(); err != nil {
log.Printf("failed to close response body: %v", err)
}
}
// GetDefaultBranch gets the default branch of a github repository.
// m is a map which maps a github repository to its default branch.
// If repo is already in m, the default branch for url will be obtained from m;
// otherwise, a query will be made to github to obtain the default branch.
func (gcl GhClient) GetDefaultBranch(url, repo string, m map[string]string) (string, error) {
if v, ok := m[repo]; ok {
return v, nil
}
resp, err := gcl.GetReposData(url)
if err != nil {
return "", fmt.Errorf(
"'%s' could not get default_branch: %v", url, err)
}
defer CloseResponseBody(resp)
data, err := ioutil.ReadAll(resp.Body)
if err != nil {
return "", fmt.Errorf(
"could not read default_branch: %v", err)
}
type defaultBranch struct {
DefaultBranch string `json:"default_branch,omitempty"`
}
var branch defaultBranch
err = json.Unmarshal(data, &branch)
if err != nil {
return "", fmt.Errorf(
"default_branch json malformed: %v", err)
}
return branch.DefaultBranch, nil
}
// GetFileCreationTime gets the earliest date of a file.
func (gcl GhClient) GetFileCreationTime(
k GhFileSpec) (time.Time, error) {
url := gcl.CommitsRequest(k.Repository.FullName, k.Path)
defaultTime := time.Now()
resp, err := gcl.GetReposData(url)
if err != nil {
return defaultTime, fmt.Errorf(
"%+v: '%s' could not get metadata: %v", k, url, err)
}
type DateSpec struct {
Commit struct {
Author struct {
Date string `json:"date,omitempty"`
} `json:"author,omitempty"`
} `json:"commit,omitempty"`
}
_, lastURL := parseGithubLinkFormat(resp.Header.Get("link"))
if lastURL != "" {
// Close the first page before following the link to the last page.
CloseResponseBody(resp)
resp, err = gcl.GetReposData(lastURL)
if err != nil {
return defaultTime, fmt.Errorf(
"%+v: '%s' could not get metadata: %v",
k, lastURL, err)
}
}
defer CloseResponseBody(resp)
data, err := ioutil.ReadAll(resp.Body)
if err != nil {
return defaultTime, fmt.Errorf(
"%+v: failed to read metadata: %v", k, err)
}
var earliestDate []DateSpec
err = json.Unmarshal(data, &earliestDate)
size := len(earliestDate)
if err != nil || size == 0 {
return defaultTime, fmt.Errorf(
"%+v: server response '%s' not in expected format: %v",
k, data, err)
}
return time.Parse(time.RFC3339, earliestDate[size-1].Commit.Author.Date)
}
// TODO(damienr74) change the tickers to actually check api rate limits, reset
// times, and throttle requests dynamically based off of current utilization,
// instead of hardcoding the documented values, these calls are not quota'd.
// This is now especially important, since caching the API requests will reduce
// API quota use (so we can actually make more requests in the allotted time
// period).
//
// See https://developer.github.com/v3/rate_limit/ for details.
var (
searchRateTicker = time.NewTicker(time.Second * 2)
contentRateTicker = time.NewTicker(time.Second * 1)
)
func throttleSearchAPI() {
<-searchRateTicker.C
}
func throttleRepoAPI() {
<-contentRateTicker.C
}
type multiError []error
func (e multiError) Error() string {
size := len(e) + 2
strs := make([]string, size)
strs[0] = "Errors ["
for i, err := range e {
strs[i+1] = "\t" + err.Error()
}
strs[size-1] = "]"
return strings.Join(strs, "\n")
}
type GitRepository struct {
API string `json:"url,omitempty"`
URL string `json:"html_url,omitempty"`
FullName string `json:"full_name,omitempty"`
}
type GhFileSpec struct {
Path string `json:"path,omitempty"`
Repository GitRepository `json:"repository,omitempty"`
}
type githubResponse struct {
// MaxUint is reserved as a sentinel value.
// This is the number of files that match the query.
TotalCount uint64 `json:"total_count,omitempty"`
IncompleteResults bool `json:"incomplete_results,omitempty"`
// Github representation of a file.
Items []GhFileSpec `json:"items,omitempty"`
}
type GhResponseInfo struct {
*http.Response
Parsed *githubResponse
Error error
NextURL string
LastURL string
}
func parseGithubLinkFormat(links string) (string, string) {
const (
linkNext = "next"
linkLast = "last"
linkInfoURL = 1
linkInfoRel = 2
)
next, last := "", ""
linkInfo := regexp.MustCompile(`<(.*)>.*; rel="(last|next)"`)
for _, link := range strings.Split(links, ",") {
linkParse := linkInfo.FindStringSubmatch(link)
if len(linkParse) != 3 {
continue
}
url := linkParse[linkInfoURL]
switch linkParse[linkInfoRel] {
case linkNext:
next = url
case linkLast:
last = url
default:
}
}
return next, last
}
func (gcl GhClient) parseGithubResponseWithRetry(getRequest string) GhResponseInfo {
resp := gcl.parseGithubResponse(getRequest)
retries := 0
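// Parsed is nil when the request failed, so stop retrying on error.
for resp.Error == nil && resp.Parsed != nil && resp.Parsed.IncompleteResults {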
resp = gcl.parseGithubResponse(getRequest)
retries++
}
log.Printf("The result of query(%s) is complete after %d retries", getRequest, retries)
return resp
}
func (gcl GhClient) parseGithubResponse(getRequest string) GhResponseInfo {
resp, err := gcl.SearchGithubAPI(getRequest)
requestInfo := GhResponseInfo{
Response: resp,
Error: err,
Parsed: nil,
}
if err != nil || resp == nil {
return requestInfo
}
var data []byte
defer CloseResponseBody(resp)
data, requestInfo.Error = ioutil.ReadAll(resp.Body)
if requestInfo.Error != nil {
return requestInfo
}
if resp.StatusCode != http.StatusOK {
logger.Println("query: ", getRequest)
logger.Println("status not OK at the source")
logger.Println("header dump", resp.Header)
logger.Println("body dump", string(data))
requestInfo.Error = fmt.Errorf("request rejected, status '%s'",
resp.Status)
return requestInfo
}
requestInfo.NextURL, requestInfo.LastURL =
parseGithubLinkFormat(resp.Header.Get("link"))
resultCount := githubResponse{
TotalCount: math.MaxUint64,
}
requestInfo.Error = json.Unmarshal(data, &resultCount)
if requestInfo.Error != nil {
return requestInfo
}
requestInfo.Parsed = &resultCount
return requestInfo
}
// SearchGithubAPI performs a search query and handles rate limiting for
// the 'search/code?' endpoint as well as timed retries in the case of abuse
// prevention.
func (gcl GhClient) SearchGithubAPI(query string) (*http.Response, error) {
throttleSearchAPI()
return gcl.getWithRetry(query)
}
// GetReposData performs a repository query and handles rate limiting for
// the '/repos' endpoint as well as timed retries in the case of abuse
// prevention.
func (gcl GhClient) GetReposData(query string) (*http.Response, error) {
throttleRepoAPI()
return gcl.getWithRetry(query)
}
// GetRawUserContent fetches raw file contents. User content is not API rate
// limited, so there's no use in throttling this call.
func (gcl GhClient) GetRawUserContent(query string) (*http.Response, error) {
return gcl.getWithRetry(query)
}
func (gcl GhClient) Do(query string) (*http.Response, error) {
req, err := http.NewRequest("GET", query, nil)
if err != nil {
return nil, err
}
req.Header.Add("Authorization", fmt.Sprintf("token %s", gcl.accessToken))
// gcl.client.Do: a non-2xx status code doesn't cause an error.
// See https://golang.org/pkg/net/http/#Client.Do for more info.
resp, err := gcl.client.Do(req)
if resp != nil && resp.StatusCode != http.StatusOK {
err = fmt.Errorf("GhClient.Do(%s) failed with response code: %d",
query, resp.StatusCode)
}
return resp, err
}
func (gcl GhClient) getWithRetry(
query string) (resp *http.Response, err error) {
resp, err = gcl.Do(query)
retryCount := gcl.retryCount
for resp != nil && resp.StatusCode == http.StatusForbidden && retryCount > 0 {
retryTime := resp.Header.Get("Retry-After")
i, errAtoi := strconv.Atoi(retryTime)
if errAtoi != nil {
return resp, fmt.Errorf(
"query '%s' forbidden without 'Retry-After'", query)
}
logger.Printf(
"status forbidden, retrying %d more times\n", retryCount)
logger.Printf("waiting %d seconds before retrying\n", i)
// Release the rejected response before issuing a new request.
CloseResponseBody(resp)
time.Sleep(time.Second * time.Duration(i))
retryCount--
resp, err = gcl.Do(query)
}
if err != nil {
return resp, fmt.Errorf("query '%s' could not be processed, %v",
query, err)
}
return resp, err
}

View File

@@ -1,231 +0,0 @@
package github
import (
"fmt"
"net/url"
"strings"
)
const (
perPageArg = "per_page"
)
const githubMaxPageSize = 100
// Implementation detail, not important to external API.
type queryField struct {
name string
value interface{}
}
// Formats a query field.
func (qf queryField) String() string {
var value string
switch v := qf.value.(type) {
case string:
value = v
case rangeFormatter:
value = v.RangeString()
default:
value = fmt.Sprint(v)
}
if qf.name == "" {
return value
}
return fmt.Sprint(qf.name, ":", value)
}
// Example of formatting a query:
// QueryWith(
// Filename("kustomization.yaml"),
// Filesize(RangeWithin{64, 192}),
// Keyword("copyright"),
// Keyword("2019"),
// ).String()
//
// Outputs "q=filename:kustomization.yaml+size:64..192+copyright+2019" which
// would search for files that have [64, 192] bytes (inclusive range) and that
// contain the keywords 'copyright' and '2019' somewhere in the file.
type Query []queryField
func QueryWith(qfs ...queryField) Query {
return qfs
}
func (q Query) String() string {
strs := make([]string, 0, len(q))
for _, elem := range q {
str := elem.String()
if str == "" {
continue
}
strs = append(strs, str)
}
query := strings.Join(strs, "+")
if query == "" {
return query
}
return "q=" + query
}
// Keyword takes a single word, and formats it according to the Github API.
func Keyword(k string) queryField {
return queryField{value: k}
}
// Filesize takes a rangeFormatter and formats it according to the Github API.
func Filesize(r rangeFormatter) queryField {
return queryField{name: "size", value: r}
}
// Filename takes a filename and formats it according to the Github API.
func Filename(f string) queryField {
return queryField{name: "filename", value: f}
}
// Path takes a filepath and formats it according to the Github API.
func Path(p string) queryField {
return queryField{name: "path", value: p}
}
// Repo takes a repository (i.e., kubernetes-sigs/kustomize) and formats
// it according to the Github API.
func Repo(r string) queryField {
return queryField{name: "repo", value: r}
}
// User takes a github username and formats it according to the Github API.
func User(u string) queryField {
return queryField{name: "user", value: u}
}
// RequestConfig stores common variables that must be present for the queries.
// - CodeSearchRequests: ask Github to check the code indices given a query.
// - ContentsRequests: ask Github where to download a resource given a repo and a
// file path.
// - CommitsRequests: asks Github to list commits made on a file. Useful to
// determine the date of a file.
type RequestConfig struct {
perPage uint64
}
// CodeSearchRequestWith, given a list of query parameters that specify the
// (partial) query, returns a request object with the (partial) query. Call
// the URL method to get the string value of the URL. See request.CopyWith to
// understand why the request object is useful.
func (rc RequestConfig) CodeSearchRequestWith(query Query) request {
vals := url.Values{
"sort": []string{"indexed"},
"order": []string{"desc"},
}
req := rc.makeRequest("search/code", query, vals)
return req
}
// ContentsRequest, given the repo name and the filepath, returns a formatted
// query for the Github API to find the download information of this filepath.
func (rc RequestConfig) ContentsRequest(fullRepoName, path string) string {
uri := fmt.Sprintf("repos/%s/contents/%s", fullRepoName, path)
return rc.makeRequest(uri, Query{}, url.Values{}).URL()
}
func (rc RequestConfig) ReposRequest(fullRepoName string) string {
uri := fmt.Sprintf("repos/%s", fullRepoName)
return rc.makeRequest(uri, Query{}, url.Values{}).URL()
}
// CommitsRequest given the repo name, and a filepath returns a formatted query
// for the Github API to find the commits that affect this file.
func (rc RequestConfig) CommitsRequest(fullRepoName, path string) string {
uri := fmt.Sprintf("repos/%s/commits", fullRepoName)
vals := url.Values{
"path": []string{path},
}
return rc.makeRequest(uri, Query{}, vals).URL()
}
func (rc RequestConfig) makeRequest(path string, query Query, vals url.Values) request {
vals.Set(perPageArg, fmt.Sprint(rc.perPage))
return request{
url: url.URL{
Scheme: "https",
Host: "api.github.com",
Path: path,
},
vals: vals,
query: query,
}
}
type request struct {
url url.URL
vals url.Values
query Query
}
// CopyWith copies the requests and adds the extra query parameters. It is useful
// for dynamically adding sizes to a filename only query without modifying it.
func (r request) CopyWith(queryParams ...queryField) request {
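cpy := r
// Build a fresh slice so the copy never shares a backing array with r.query.
cpy.query = make(Query, 0, len(r.query)+len(queryParams))
cpy.query = append(cpy.query, r.query...)
cpy.query = append(cpy.query, queryParams...)
return cpy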
}
// URL encodes the variables and the URL representation into a string.
func (r request) URL() string {
// Github does not handle URL encoding properly in its API for the
// q='...', so the query parameter is added without any encoding
// manually.
encoded := r.vals.Encode()
query := r.query.String()
// Only join with "&" when both parts are present; url.URL.String adds the
// leading "?" itself.
sep := "&"
if query == "" || encoded == "" {
sep = ""
}
r.url.RawQuery = query + sep + encoded
return r.url.String()
}
// rangeFormatter defines a range of numbers and prints it in the github range
// query format https://help.github.com/en/articles/understanding-the-search-syntax.
type rangeFormatter interface {
RangeString() string
}
// RangeLessThan is a range of values strictly less than (<) size.
type RangeLessThan struct {
size uint64
}
func (r RangeLessThan) RangeString() string {
return fmt.Sprintf("<%d", r.size)
}
// RangeGreaterThan is a range of values strictly greater than (>) size.
type RangeGreaterThan struct {
size uint64
}
func (r RangeGreaterThan) RangeString() string {
return fmt.Sprintf(">%d", r.size)
}
// RangeWithin is an inclusive range from start to end.
type RangeWithin struct {
start uint64
end uint64
}
func (r RangeWithin) RangeString() string {
return fmt.Sprintf("%d..%d", r.start, r.end)
}
func (r RangeWithin) Size() uint64 {
return r.end - r.start
}

View File

@@ -1,140 +0,0 @@
package github
import (
"testing"
)
func TestQueryFields(t *testing.T) {
testCases := []struct {
formatter queryField
expected string
}{
{
formatter: Keyword("keyword"),
expected: "keyword",
},
{
formatter: Filesize(RangeLessThan{23}),
expected: "size:<23",
},
{
formatter: Filesize(RangeWithin{24, 64}),
expected: "size:24..64",
},
{
formatter: Filesize(RangeGreaterThan{64}),
expected: "size:>64",
},
{
formatter: Path("some/path/to/file"),
expected: "path:some/path/to/file",
},
{
formatter: Filename("kustomization.yaml"),
expected: "filename:kustomization.yaml",
},
}
for _, test := range testCases {
if result := test.formatter.String(); result != test.expected {
t.Errorf("got (%#v = %s), expected %s", test.formatter, result, test.expected)
}
}
}
func TestQueryType(t *testing.T) {
testCases := []struct {
query Query
expected string
}{
{
query: QueryWith(
Filesize(RangeWithin{24, 64}),
Filename("kustomization.yaml"),
Keyword("keyword1"),
Keyword("keyword2"),
Repo("user1/repo1"),
User("user1"),
),
expected: "q=size:24..64+filename:kustomization.yaml+keyword1+keyword2+" +
"repo:user1/repo1+user:user1",
},
}
for _, test := range testCases {
if queryStr := test.query.String(); queryStr != test.expected {
t.Errorf("got (%#v = %s), expected %s", test.query, queryStr, test.expected)
}
}
}
func TestGithubSearchQuery(t *testing.T) {
const (
perPage = 100
)
testCases := []struct {
rc RequestConfig
codeQuery Query
fullRepoName string
path string
expectedCodeQuery string
expectedContentsQuery string
expectedCommitsQuery string
}{
{
rc: RequestConfig{
perPage: perPage,
},
codeQuery: Query{
Filename("kustomization.yaml"),
Filesize(RangeWithin{64, 128}),
},
fullRepoName: "kubernetes-sigs/kustomize",
path: "examples/helloWorld/kustomization.yaml",
expectedCodeQuery: "https://api.github.com/search/code?" +
"q=filename:kustomization.yaml+size:64..128&order=desc&per_page=100&sort=indexed",
expectedContentsQuery: "https://api.github.com/repos/kubernetes-sigs/kustomize/contents/" +
"examples/helloWorld/kustomization.yaml?per_page=100",
expectedCommitsQuery: "https://api.github.com/repos/kubernetes-sigs/kustomize/commits?" +
"path=examples%2FhelloWorld%2Fkustomization.yaml&per_page=100",
},
{
rc: RequestConfig{
perPage: perPage,
},
codeQuery: Query{
Filename("kustomization.yaml"),
Filesize(RangeWithin{64, 128}),
},
fullRepoName: "kubernetes-sigs/kustomize",
path: "examples 1/helloWorld/kustomization.yaml",
expectedCodeQuery: "https://api.github.com/search/code?" +
"q=filename:kustomization.yaml+size:64..128&order=desc&per_page=100&sort=indexed",
expectedContentsQuery: "https://api.github.com/repos/kubernetes-sigs/kustomize/contents/" +
"examples%201/helloWorld/kustomization.yaml?per_page=100",
expectedCommitsQuery: "https://api.github.com/repos/kubernetes-sigs/kustomize/commits?" +
"path=examples+1%2FhelloWorld%2Fkustomization.yaml&per_page=100",
},
}
for _, test := range testCases {
if result := test.rc.CodeSearchRequestWith(test.codeQuery).URL(); result != test.expectedCodeQuery {
t.Errorf("Got code query: %s, expected %s", result, test.expectedCodeQuery)
}
if result := test.rc.ContentsRequest(test.fullRepoName, test.path); result != test.expectedContentsQuery {
t.Errorf("Got contents query: %s, expected %s", result, test.expectedContentsQuery)
}
if result := test.rc.CommitsRequest(test.fullRepoName, test.path); result != test.expectedCommitsQuery {
t.Errorf("Got commits query: %s, expected %s", result, test.expectedCommitsQuery)
}
}
}

View File

@@ -1,379 +0,0 @@
package github
// GitHub only returns at most 1000 results per search query,
// this is problematic if you want to retrieve all the results for a given
// search query. However, GitHub allows you to specify as much as you want per
// query to make things more specific. Specifically for files, GitHub allows
// you to specify their sizes with range queries. This is very convenient
// since it allows us to split the search into disjoint sets/shards of results
// from the different file size ranges.
//
// Some important factors to consider:
//
// - These queries are rate limited by the API to roughly one query every two
// seconds.
//
// - The search space for file sizes is in bytes, from 0B to < 512KiB (this is
// a huge search space that cannot be probed linearly in a timely manner if
// granularity is to be expected).
//
// - If you have K files there will likely be ~K/1000 sets that you have to
// find in this search space in order to get all of the results.
//
// - If you have O(K) sets it is unlikely that they all span ranges of the
// same size, since file sizes are typically power law distributed. That
// means that the range might be significantly smaller for 1000 small files
// than it is for 1000 large files.
//
// - This method is a best effort approach. There are some limitations to what
// it can and can't do, so please note the following:
//
// + There may very well be a filesize that has more than 1000 results.
// This method cannot help in this case. However, requerying over time
// (days/weeks/months) while sorting by last indexed values may be
// sufficient to eventually get all of the results.
//
// + It's possible that the github API returns inconsistent counts. This
// is problematic in most cases, since it can cause many issues if the
// case is not handled properly. For instance, if you requested the
// number of files of an interval from size:0..64 and get that there
// are 900 results, you may query at size:0..96 and get that there
// are 800 results. To guarantee that this approach completes and does
// not get into a query loop over the same intervals, it will retry a few
// times and take the largest of the results or the largest previously
// queried value from another range (in this case, the implementation
// could decide that size:0..96 must have 900 results). This makes the
// approach best effort even if there are no single file sizes of over
// 1000 results.
//
//
// The approach that was taken to solve this problem is the following:
//
// 1. Determine the total number of results by querying from the lower bound
// to the upper bound (size:0..max). If there are no more than 1000 files,
// return a single range of values (size:0..max) since all results can be
// retrieved.
//
// 2. Otherwise, set a target number of files to be 1000.
//
// 3. Binary search for the range from 0..r that provides a file count that is
// less than or equal to the target. Once this value is found, store the
// upper bound of range (r). If r is the same as the previous value, (or 0)
// increase r by one (this guarantees progress, but will miss out on some
// results).
//
// 4. Increase the target by 1000.
//
// 5. Repeat steps 3 and 4 until the target is at or exceeds the total number
// of files.
//
//
// In general there are other ways to get all of the files from GitHub. In
// some cases it would be sufficient to just get the files that are being
// updated/indexed by github periodically to update the corpus, so this
// complicated approach does not have to be run every time. However, for
// some searches, there may be too many results on a time interval to do
// this simple update search limited to only 1000 results.
//
// There is also a more sophisticated approach that may yield better
// performance:
// - Perform this search once and create a prior distribution of file sizes.
// Each time you want to retrieve the results of the query, scale the
// prior of expected ranges to the current number of files. From each
// expected range of 1000 files, perform an exponential search to find the
// lower bound of the range. This would likely reduce the total number
// of queries by a significant amount since it would only have to search
// for a small set of values around each likely range boundary.
//
// However, actually retrieving the files will be the bottleneck operation
// since the number of queries to find the ranges will be close to:
// log2(maxFileSize) * totalResults / 1000 ~= totalResults / 50
// whereas the number of queries to actually get all of the search results
// are close to:
// apiCallsPerResult * 10(pages) * 100(resultsPerPage) * totalResults / 1000
// = apiCallsPerResult * totalResults.
//
// So it could very well take apiCallsPerResult * 50 times longer to actually
// fetch the results (assuming the quotas for the API calls are the same as the
// search API), than it does to perform these range searches.
import (
"fmt"
"math/bits"
"strconv"
"strings"
)
// Files cannot be more than 2^19 bytes, according to
// https://help.github.com/en/articles/searching-code#considerations-for-code-search
const (
githubMaxFileSize = uint64(1 << 19)
githubMaxResultsPerQuery = uint64(1000)
)
// Interface instead of struct for testing purposes.
// Not expecting to have multiple implementations.
type cachedSearch interface {
CountResults(uint64, uint64) (uint64, error)
RequestString(filesize rangeFormatter) string
}
// githubCachedSearch is a simple data structure that maps the upper bound (r)
// of a range from 0 to r to the number of files whose size is between 0 and r
// bytes (inclusive). It also guarantees that the counts are monotonically
// increasing (not strictly) as the value for r increases, by looking at the
// cached file count for the value that precedes r in the cache.
//
// It uses a bit trick to be more efficient in detecting
// inconsistencies in the returned data from the Github API.
// Therefore, the cache expects a search to always start at 0, and
// it expects the max file size to be a power of 2. If this is to be changed
// there are a few considerations to keep in mind:
//
// 1. The cache is only efficient if the queries can be reused, so if
// the first chunk of files lives in the range 0..x, continuing the
// search for the next chunk from x+1..max (while asymptotically sane)
// may actually be less efficient since the cache is essentially reset
// at every interval. This leads to a larger number of requests in
// practice, and requests are what's expensive (rate limits).
//
// 2. The github API is not perfectly monotonic (this is somewhat
// problematic). The current cache implementation looks at the
// predecessor entry to find out if the current value is monotonic.
// This is where the bit trick is used, since each step in the binary
// search is adding or omitting to add a decreasing power of 2 to the query
// value, we can remove the least significant set bit to find the
// predecessor in constant time. Ultimately since the search is rate
// limited, we could also easily afford to compute this in linear time
// by iterating over cached values. So this trick is not crucial to the
// cache's performance.
type githubCachedSearch struct {
cache map[uint64]uint64
gcl GhClient
baseRequest request
}
func newCache(client GhClient, query Query) githubCachedSearch {
return githubCachedSearch{
cache: map[uint64]uint64{
0: 0,
},
gcl: client,
baseRequest: client.CodeSearchRequestWith(query),
}
}
func (c githubCachedSearch) CountResults(lowerBound, upperBound uint64) (uint64, error) {
count, cached := c.cache[upperBound]
if cached {
return count, nil
}
sizeRange := RangeWithin{lowerBound, upperBound}
rangeRequest := c.RequestString(sizeRange)
result := c.gcl.parseGithubResponseWithRetry(rangeRequest)
if result.Error != nil {
return count, result.Error
}
// As range search uses powers of 2 for binary search, the previously
// cached value is easy to find by removing the least significant set
// bit from the current upperBound, since each step of the search adds
// a new least significant set bit.
//
// Finding the predecessor could also be implemented by iterating over
// the map to find the largest key that is smaller than upperBound if
// this approach is deemed too complex.
trail := bits.TrailingZeros64(upperBound)
prev := uint64(0)
if trail != 64 {
prev = upperBound - (1 << uint64(trail))
}
// Sometimes the github API is not monotonically increasing, or outputs
// an erroneous value of 0, or 1. This logic makes sure that it was not
// erroneous, and that the sequence continues to be monotonic by setting
// the current query count to match the previous value, which at least
// guarantees that the range search terminates.
//
// On the other hand, if files are added, then we may lose out on some
// files in a previously completed range, but these files should be there
// the next time the crawler runs, so this is not really problematic.
retryMonotonicCount := 4
for result.Parsed.TotalCount < c.cache[prev] {
logger.Printf(
"Retrying query... current lower bound: %d, got: %d\n",
c.cache[prev], result.Parsed.TotalCount)
result = c.gcl.parseGithubResponseWithRetry(rangeRequest)
if result.Error != nil {
return count, result.Error
}
retryMonotonicCount--
if retryMonotonicCount <= 0 {
result.Parsed.TotalCount = c.cache[prev]
logger.Println(
"Retries for monotonic check exceeded,",
" setting value to match predecessor")
}
}
count = result.Parsed.TotalCount
logger.Printf("Caching new query %s, with count %d (incomplete_results: %v)\n",
sizeRange.RangeString(), count, result.Parsed.IncompleteResults)
c.cache[upperBound] = count
return count, nil
}
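The predecessor lookup used by CountResults can be checked on its own. In this sketch `predecessor` is a hypothetical name; the sample values are taken from the probe sequence that a power-of-two binary search would produce.

```go
package main

import (
	"fmt"
	"math/bits"
)

// predecessor (hypothetical) clears the least significant set bit of x, which
// yields the upper bound probed just before x in a power-of-two binary
// search. It returns 0 for x == 0.
func predecessor(x uint64) uint64 {
	if x == 0 {
		return 0
	}
	return x - (uint64(1) << uint(bits.TrailingZeros64(x)))
}

func main() {
	// 0x6b was reached from 0x6a, which was reached from 0x68, and so on.
	for _, x := range []uint64{0x6b, 0x6a, 0x68, 0x60, 0x40} {
		fmt.Printf("%#x -> %#x\n", x, predecessor(x))
	}
}
```

The same value can be computed as `x & (x - 1)`, which avoids the special case for zero.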
func (c githubCachedSearch) RequestString(filesize rangeFormatter) string {
return c.baseRequest.CopyWith(Filesize(filesize)).URL()
}
// FindRangesForRepoSearch outputs a (possibly incomplete) list of ranges to
// query to find most search results, as permitted by the GitHub search API.
// GitHub search only allows 1,000 results per query (paginated).
// Source: https://developer.github.com/v3/search/
//
// This leaves the possibility of having file sizes with more than 1000 results.
// This would mean that the search as it is could not find all files. If queries
// are sorted by last indexed, and retrieved on regular intervals, it should be
// sufficient to get most if not all documents.
func FindRangesForRepoSearch(cache cachedSearch, lowerBound, upperBound uint64) ([]string, error) {
totalFiles, err := cache.CountResults(lowerBound, upperBound)
if err != nil {
return nil, err
}
logger.Println("total kustomization files: ", totalFiles)
if githubMaxResultsPerQuery >= totalFiles {
return []string{
cache.RequestString(RangeWithin{lowerBound, upperBound}),
}, nil
}
// Find all the ranges of file sizes such that all files are queryable
// using the Github API. This does not compute optimal ranges, since
// the number of queries needed to get the information required to
// compute an optimal range is expected to be much larger than the
// number of queries performed this way.
//
// The number of ranges is k = (number of files)/1000, and finding a
// range is logarithmic in the max file size (n = filesize). This means
// that preprocessing takes O(k * lg n) queries to find the ranges with
// a binary search over file sizes.
//
// My intuition is that this approach is competitive to a perfectly
// optimal solution, but I didn't actually take the time to do a
// rigorous proof. Intuitively, since file sizes are typically power
// law distributed the binary search will be very skewed towards the
// smaller file ranges. This means that in practice this approach will
// make fewer than (#files/1000)*(log(n) = 19) queries for
// preprocessing, since it reuses a lot of the queries in the denser
// ranges. Furthermore, because of the distribution, it should be very
// easy to find ranges that are very close to the upper bound, up to
// the limiting factor of having no more than 1000 files accessible per
// range.
filesAccessible := uint64(0)
sizes := make([]uint64, 0)
sizes = append(sizes, lowerBound)
for filesAccessible < totalFiles {
target := filesAccessible + githubMaxResultsPerQuery
if target >= totalFiles {
break
}
logger.Printf("%d accessible files, next target = %d\n",
filesAccessible, target)
size, err := FindFileSize(cache, target, lowerBound, upperBound)
if err != nil {
return nil, err
}
// If there are more than 1000 files in the next bucket, we must
// advance anyway and lose out on some files :(.
if l := len(sizes); l > 0 && sizes[l-1] == size {
size++
}
nextAccessible, err := cache.CountResults(lowerBound, size)
if err != nil {
return nil, fmt.Errorf(
"cache should be populated at %d already, got %v",
size, err)
}
if nextAccessible < filesAccessible {
return nil, fmt.Errorf(
"number of results dropped from %d to %d within range search",
filesAccessible, nextAccessible)
}
filesAccessible = nextAccessible
if nextAccessible < totalFiles {
sizes = append(sizes, size)
}
}
sizes = append(sizes, upperBound)
return formatFilesizeRanges(cache, sizes), nil
}
// FindFileSize finds the filesize range from [lowerBound, return value] that has
// the largest file count that is smaller than or equal to
// githubMaxResultsPerQuery. It is important to note that this returned value
// could already be in a previous range if the next file size has more than 1000
// results. It is left to the caller to handle this bit of logic and guarantee
// forward progression in this case.
func FindFileSize(
cache cachedSearch, targetFileCount, lowerBound, upperBound uint64) (uint64, error) {
// Binary search for file sizes that make up the next <=1000 element
// chunk.
cur := lowerBound
increase := (upperBound - lowerBound) / 2
for increase > 0 {
mid := cur + increase
count, err := cache.CountResults(lowerBound, mid)
if err != nil {
return count, err
}
if count <= targetFileCount {
cur = mid
}
if count == targetFileCount {
break
}
increase /= 2
}
return cur, nil
}
func formatFilesizeRanges(cache cachedSearch, sizes []uint64) []string {
n := len(sizes)
if n < 2 {
return []string{}
}
ranges := make([]string, 0, n-1)
ranges = append(ranges, cache.RequestString(RangeWithin{sizes[0], sizes[1]}))
for i := 1; i < n-1; i++ {
ranges = append(ranges, cache.RequestString(RangeWithin{sizes[i] + 1, sizes[i+1]}))
}
return ranges
}
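// RangeSizes extracts the file size range from a request URL. It assumes that
// the URL contains a "+size:A..B" term that ends at the first '&'; parse
// errors are ignored.
func RangeSizes(s string) RangeWithin {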
start := strings.Index(s, "+size:") + len("+size:")
end := strings.Index(s, "&")
ranges := strings.Split(s[start:end], "..")
lowerBound, _ := strconv.ParseUint(ranges[0], 10, 64)
upperBound, _ := strconv.ParseUint(ranges[1], 10, 64)
return RangeWithin{lowerBound, upperBound}
}

View File

@@ -1,101 +0,0 @@
package github
import (
"fmt"
"log"
"reflect"
"testing"
)
type testCachedSearch struct {
cache map[uint64]uint64
}
func (c testCachedSearch) CountResults(lowerBound, upperBound uint64) (uint64, error) {
log.Printf("CountResults(%05x)\n", upperBound)
count, ok := c.cache[upperBound]
if !ok {
return count, fmt.Errorf("cache not set at %x", upperBound)
}
return count, nil
}
func (c testCachedSearch) RequestString(filesize rangeFormatter) string {
return filesize.RangeString()
}
// TODO(damienr74) make tests easier to write.. I'm thinking I can make the test
// cache take in a list of (filesize, count) pairs and it can populate the cache
// without relying on how the implementation will create queries. This was only
// a quick and dirty test to make sure that modifications are not going to break
// the functionality.
func TestRangeSplitting(t *testing.T) {
// Keys follow the binary search depending on whether or not the range
// is too small/large to find close to optimal filesize ranges. This
// test is heavily tied to the fact that the search is using powers of two
// to make progress in the search (hence the use of hexadecimal values).
cache := testCachedSearch{
map[uint64]uint64{
0x80000: 5000,
0x40000: 5000,
0x20000: 5000,
0x10000: 5000,
0x08000: 5000,
0x04000: 5000,
0x02000: 5000,
0x01000: 5000,
0x00fff: 3950,
0x00ffe: 3950,
0x00ffc: 3950,
0x00ff8: 3950,
0x00ff0: 3950,
0x00fe0: 3950,
0x00fc0: 3950,
0x00f80: 3950,
0x00f00: 3950,
0x00e00: 3950,
0x00c00: 3950,
0x00800: 3950,
0x00400: 3950,
0x00200: 3688,
0x00180: 3028,
0x00100: 2999,
0x000c0: 2448,
0x00080: 1999,
0x00070: 1600,
0x0006c: 1003,
0x0006b: 1001,
0x0006a: 999,
0x00068: 999,
0x00060: 999,
0x00040: 999,
0x00000: 0,
},
}
requests, err := FindRangesForRepoSearch(cache, 0, 524288)
if err != nil {
t.Errorf("Error while finding ranges: %v", err)
}
expected := []string{
"0..106", // cache.RequestString(RangeWithin{0x00, 0x6a}),
"107..128", // cache.RequestString(RangeWithin{0x6b, 0x80}),
"129..256", // cache.RequestString(RangeWithin{0x81, 0x100}),
"257..4095", // cache.RequestString(RangeWithin{0x101, 0xfff}),
"4096..524288", // cache.RequestString(RangeWithin{0x1000, 0x80000}),
}
if !reflect.DeepEqual(requests, expected) {
t.Errorf("Expected requests (%v) to equal (%v)", requests, expected)
}
}
func TestRangeSizes(t *testing.T) {
s := "https://api.github.com/search/code?q=filename:kustomization.yaml+filename:kustomization.yml" +
"+filename:kustomization+size:2365..10000&order=desc&per_page=100&sort=indexed"
returnedResult := RangeSizes(s)
expectedResult := RangeWithin{uint64(2365), uint64(10000)}
if !reflect.DeepEqual(returnedResult, expectedResult) {
t.Errorf("RangeSizes expected (%v), got (%v)", expectedResult, returnedResult)
}
}

View File

@@ -1,268 +0,0 @@
package doc
import (
"fmt"
"log"
"path/filepath"
"sort"
"strings"
"sigs.k8s.io/kustomize/api/konfig"
"sigs.k8s.io/kustomize/api/provider"
"sigs.k8s.io/kustomize/api/resource"
"sigs.k8s.io/kustomize/api/types"
"sigs.k8s.io/yaml"
)
// This document is meant to be used as the elasticsearch document type.
// Fields are serialized as-is to elasticsearch, where indices are built
// to facilitate text search queries. Identifiers, Values, FilePath,
// RepositoryURL and DocumentData are meant to be searched for text queries
// directly, while the other fields can either be used as a filter, or as
// additional metadata displayed in the UI.
//
// The fields of the document and their purpose are listed below:
// - DocumentData contains the contents of the kustomization file.
// - Kinds represents the kubernetes Kinds that are in this file.
// - Identifiers are a list of (partial and full) identifier paths that can be
// found by users. Each part of a path is delimited by ":" e.g. spec:replicas.
// - Values are a list of identifier paths and their values that can be found by
// search queries. The path is delimited by ":" and the value follows the "="
// symbol e.g. spec:replicas=4.
// - FilePath is the path of the file.
// - RepositoryURL is the URL of the source repository.
// - CreationTime is the time at which the file was created.
//
// Representing each Identifier and Value as a flat string representation
// facilitates the use of complex text search features from elasticsearch such
// as fuzzy searching, regex, wildcards, etc.
type KustomizationDocument struct {
Document
Kinds []string `json:"kinds,omitempty"`
Identifiers []string `json:"identifiers,omitempty"`
Values []string `json:"values,omitempty"`
resFactory *resource.Factory
}
type set map[string]struct{}
func (doc *KustomizationDocument) Copy() *KustomizationDocument {
return &KustomizationDocument{
Document: *(doc.Document.Copy()),
Kinds: doc.Kinds,
Identifiers: doc.Identifiers,
Values: doc.Values,
resFactory: provider.NewDefaultDepProvider().GetResourceFactory(),
}
}
func (doc *KustomizationDocument) String() string {
return fmt.Sprintf("%s %s %s %v %v %v len(identifiers):%v len(values):%v",
doc.RepositoryURL, doc.FilePath, doc.DefaultBranch, doc.CreationTime,
doc.IsSame, doc.Kinds, len(doc.Identifiers), len(doc.Values))
}
// IsKustomizationFile reports whether the base name of path is one of the
// recognized kustomization file names.
func IsKustomizationFile(path string) bool {
basename := filepath.Base(path)
for _, name := range konfig.RecognizedKustomizationFileNames() {
if basename == name {
return true
}
}
return false
}
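// GetResources implements the CrawlerDocument interface. It returns a Document
// for each resource, generator, and transformer path listed in the
// kustomization file, depending on which include flags are set. It returns an
// empty slice when doc is not a kustomization file.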
func (doc *KustomizationDocument) GetResources(
includeResources, includeTransformers, includeGenerators bool) ([]*Document, error) {
if !IsKustomizationFile(doc.FilePath) {
return []*Document{}, nil
}
content := []byte(doc.DocumentData)
content, err := FixKustomizationPreUnmarshallingNonFatal(content)
if err != nil {
return nil, fmt.Errorf("could not fix kustomize file: %v", err)
}
var k types.Kustomization
err = yaml.Unmarshal(content, &k)
if err != nil {
return nil, fmt.Errorf(
"could not parse kustomization: %v", err)
}
k.FixKustomizationPostUnmarshalling()
res := make([]*Document, 0)
if includeResources {
resourceDocs := doc.CollectDocuments(k.Resources, "resource")
res = append(res, resourceDocs...)
}
if includeGenerators {
generatorDocs := doc.CollectDocuments(k.Generators, "generator")
res = append(res, generatorDocs...)
}
if includeTransformers {
transformerDocs := doc.CollectDocuments(k.Transformers, "transformer")
res = append(res, transformerDocs...)
}
return res, nil
}
// CollectDocuments constructs a Document for each non-blank path in paths and
// returns a slice of Document pointers. Paths that cannot be resolved are
// logged and skipped.
func (doc *KustomizationDocument) CollectDocuments(
paths []string, fileType string) []*Document {
docs := make([]*Document, 0, len(paths))
for _, r := range paths {
if strings.TrimSpace(r) == "" {
continue
}
next, err := doc.Document.FromRelativePath(r)
if err != nil {
log.Printf("CollectDocuments error: %v\n", err)
continue
}
next.FileType = fileType
docs = append(docs, &next)
}
return docs
}
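// readBytes decodes doc.DocumentData into one map per YAML document. A file
// whose path ends in "/<recognized kustomization file name>" is parsed as a
// single map; any other file is parsed as a resource file that may contain
// several YAML documents.
func (doc *KustomizationDocument) readBytes() ([]map[string]interface{}, error) {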
data := []byte(doc.DocumentData)
for _, suffix := range konfig.RecognizedKustomizationFileNames() {
if !strings.HasSuffix(doc.FilePath, "/"+suffix) {
continue
}
var config map[string]interface{}
err := yaml.Unmarshal(data, &config)
if err != nil {
return nil, fmt.Errorf(
"unable to parse kustomization: %v", err)
}
return []map[string]interface{}{config}, nil
}
configs := make([]map[string]interface{}, 0)
ks, err := doc.resFactory.SliceFromBytes(data)
if err != nil {
return nil, fmt.Errorf("unable to parse resource: %v", err)
}
for _, k := range ks {
configs = append(configs, k.Map())
}
return configs, nil
}
// ParseYAML parses doc.DocumentData and sets the following fields of doc:
// Kinds, Values, Identifiers.
func (doc *KustomizationDocument) ParseYAML() error {
doc.Identifiers = make([]string, 0)
doc.Values = make([]string, 0)
doc.Kinds = make([]string, 0, 1)
identifierSet := make(set)
valueSet := make(set)
kindSet := make(set)
getKind := func(m map[string]interface{}) string {
const defaultStr = "Kustomization"
kind, ok := m["kind"]
if !ok {
return defaultStr
}
if str, ok := kind.(string); ok && str != "" {
return str
}
return defaultStr
}
ks, err := doc.readBytes()
if err != nil {
return err
}
for _, contents := range ks {
kindSet[getKind(contents)] = struct{}{}
createFlatStructure(identifierSet, valueSet, contents)
}
for val := range kindSet {
doc.Kinds = append(doc.Kinds, val)
}
for val := range valueSet {
doc.Values = append(doc.Values, val)
}
for key := range identifierSet {
doc.Identifiers = append(doc.Identifiers, key)
}
// Map iteration order is random, so without sorting, the order of the strings
// in these fields would change between runs and the document in the index
// would be updated unnecessarily. Sorting keeps the output stable.
sort.Strings(doc.Kinds)
sort.Strings(doc.Values)
sort.Strings(doc.Identifiers)
return nil
}
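// createFlatStructure walks contents breadth-first and records every key path
// (e.g. "spec:replicas") in identifierSet and every path with a scalar value
// (e.g. "spec:replicas=4") in valueSet.
func createFlatStructure(identifierSet set, valueSet set, contents map[string]interface{}) {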
type Map struct {
data map[string]interface{}
prefix string
}
toVisit := []Map{
{
data: contents,
prefix: "",
},
}
for i := 0; i < len(toVisit); i++ {
visiting := toVisit[i]
for k, v := range visiting.data {
identifier := fmt.Sprintf("%s:%s", visiting.prefix, k)
// Strips the leading ":" left by the empty prefix of top-level keys;
// a no-op for keys of nested maps.
identifier = strings.TrimLeft(identifier, ":")
// Recursive function traverses structure to find
// identifiers and values. These later get formatted
// into doc.Identifiers and doc.Values respectively.
var traverseStructure func(interface{})
traverseStructure = func(arg interface{}) {
switch value := arg.(type) {
case map[string]interface{}:
toVisit = append(toVisit, Map{
data: value,
prefix: identifier,
})
case []interface{}:
for _, val := range value {
traverseStructure(val)
}
case interface{}:
esc := fmt.Sprintf("%v", value)
valuePath := fmt.Sprintf("%s=%v",
identifier, esc)
valueSet[valuePath] = struct{}{}
}
}
traverseStructure(v)
identifierSet[identifier] = struct{}{}
}
}
}
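
The flat "path" and "path=value" representation built by createFlatStructure can be illustrated with a small standalone program. This is a minimal sketch under stated assumptions, not the crawler's code: the `flatten` helper and its recursive shape are hypothetical (the original uses an explicit work list), and the input is a hand-built map standing in for decoded YAML.

```go
package main

import (
	"fmt"
	"sort"
)

// flatten is a hypothetical helper. It returns the sorted key paths ("a:b")
// and the sorted scalar value paths ("a:b=v") found in contents.
func flatten(contents map[string]interface{}) (identifiers, values []string) {
	idSet := map[string]struct{}{}
	valSet := map[string]struct{}{}

	var walk func(prefix string, node interface{})
	walk = func(prefix string, node interface{}) {
		switch v := node.(type) {
		case nil:
			// Null values contribute no "path=value" entry.
		case map[string]interface{}:
			for k, child := range v {
				id := k
				if prefix != "" {
					id = prefix + ":" + k
				}
				idSet[id] = struct{}{}
				walk(id, child)
			}
		case []interface{}:
			// List items share the path of the list itself.
			for _, item := range v {
				walk(prefix, item)
			}
		default:
			valSet[fmt.Sprintf("%s=%v", prefix, v)] = struct{}{}
		}
	}
	walk("", contents)

	for id := range idSet {
		identifiers = append(identifiers, id)
	}
	for val := range valSet {
		values = append(values, val)
	}
	// Sort so that the output does not depend on map iteration order.
	sort.Strings(identifiers)
	sort.Strings(values)
	return identifiers, values
}

func main() {
	ids, vals := flatten(map[string]interface{}{
		"kind": "Deployment",
		"spec": map[string]interface{}{"replicas": 4},
	})
	fmt.Println(ids)  // prints [kind spec spec:replicas]
	fmt.Println(vals) // prints [kind=Deployment spec:replicas=4]
}
```

Keeping each path as one flat string is what lets a text index apply wildcard or fuzzy matching to it directly.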

@@ -1,379 +0,0 @@
package doc
import (
"reflect"
"sort"
"strings"
"testing"
)
func TestParseYAML(t *testing.T) {
testCases := []struct {
identifiers []string
values []string
kinds []string
filepath string
yaml string
}{
{
identifiers: []string{
"namePrefix",
"metadata",
"metadata:name",
"kind",
},
values: []string{
"kind=",
"namePrefix=dev-",
"metadata:name=app",
},
kinds: []string{
"Kustomization",
},
filepath: "some/path/to/kustomization.yaml",
yaml: `
namePrefix: dev-
metadata:
  name: app
kind: ""
`,
},
{
identifiers: []string{
"namePrefix",
"metadata",
"metadata:name",
"metadata:spec",
"metadata:spec:replicas",
"kind",
"replicas",
"replicas:name",
"replicas:count",
"resource",
},
values: []string{
"namePrefix=dev-",
"metadata:name=n1",
"metadata:spec:replicas=3",
"kind=Kustomization",
"replicas:name=n1",
"replicas:name=n2",
"replicas:count=3",
"resource=file1.yaml",
"resource=file2.yaml",
},
kinds: []string{
"Kustomization",
},
filepath: "./kustomization.yaml",
yaml: `
namePrefix: dev-
# map of map
metadata:
  name: n1
  spec:
    replicas: 3
kind: Kustomization
#list of map
replicas:
- name: n1
  count: 3
- name: n2
  count: 3
# list
resource:
- file1.yaml
- file2.yaml
`,
},
{
identifiers: []string{
"kind",
"metadata",
"metadata:name",
},
values: []string{
"kind=Deployment",
"kind=Service",
"kind=Custom",
"metadata:name=app",
"metadata:name=app-service",
"metadata:name=app-crd",
},
kinds: []string{
"Deployment",
"Service",
"Custom",
},
filepath: "resources.yaml",
yaml: `
---
kind: Deployment
metadata:
  name: app
---
kind: Service
metadata:
  name: app-service
---
kind: Custom
metadata:
  name: app-crd
`,
},
{
identifiers: []string{
"kind",
"metadata",
"metadata:name",
},
values: []string{
"kind=Deployment",
"kind=Service",
"metadata:name=app1",
"metadata:name=app2",
},
kinds: []string{
"Deployment",
"Service",
},
filepath: "resources.yaml",
yaml: `
---
kind: Deployment
metadata:
  name: app1
---
kind: Deployment
metadata:
  name: app2
---
kind: Service
metadata:
  name: app1
`,
},
}
for _, test := range testCases {
doc := KustomizationDocument{
Document: Document{
DocumentData: test.yaml,
FilePath: test.filepath,
},
}
err := doc.ParseYAML()
if err != nil {
t.Errorf("Document error: %s", err)
}
cmpStrings := func(got, expected []string, label string) {
sort.Strings(got)
sort.Strings(expected)
if !reflect.DeepEqual(got, expected) {
t.Errorf("Expected %s (%v) to be equal to (%v)\n",
label,
strings.Join(got, ","),
strings.Join(expected, ","))
}
}
cmpStrings(doc.Identifiers, test.identifiers, "identifiers")
cmpStrings(doc.Values, test.values, "values")
cmpStrings(doc.Kinds, test.kinds, "kinds")
}
}
type TestStructForGetResources struct {
doc KustomizationDocument
resources []*Document
}
func TestGetResources(t *testing.T) {
tests := []TestStructForGetResources{
{
doc: KustomizationDocument{
Document: Document{
RepositoryURL: "sigs.k8s.io/kustomize",
FilePath: "some/path/to/kdir/kustomization.yaml",
DocumentData: `
bases:
- ../base
- ../otherbase
resources:
- file.yaml
- https://github.com/kubernetes-sigs/kustomize/examples/helloWorld?ref=v3.1.0
`},
},
resources: []*Document{
{
RepositoryURL: "sigs.k8s.io/kustomize",
FilePath: "some/path/to/base",
FileType: "resource",
User: "sigs.k8s.io",
},
{
RepositoryURL: "sigs.k8s.io/kustomize",
FilePath: "some/path/to/otherbase",
FileType: "resource",
User: "sigs.k8s.io",
},
{
RepositoryURL: "sigs.k8s.io/kustomize",
FilePath: "some/path/to/kdir/file.yaml",
FileType: "resource",
User: "sigs.k8s.io",
},
{
RepositoryURL: "https://github.com/kubernetes-sigs/kustomize",
FilePath: "examples/helloWorld",
DefaultBranch: "v3.1.0",
FileType: "resource",
User: "kubernetes-sigs",
},
},
},
{
doc: KustomizationDocument{
Document: Document{
RepositoryURL: "https://github.com/some/repo",
FilePath: "some/resource.yaml",
DocumentData: `
bases:
- ../base
- ../overlay
resources:
- https://github.com/kubernetes-sigs/kustomize/examples/helloWorld?ref=v3.1.0
- some/file.yaml
`,
},
},
resources: []*Document{},
},
}
runTest(t, tests, true, false, false)
}
func runTest(t *testing.T, tests []TestStructForGetResources, includeResources, includeTransformers, includeGenerators bool) {
for _, test := range tests {
res, err := test.doc.GetResources(includeResources, includeTransformers, includeGenerators)
if err != nil {
t.Errorf("Unexpected error: %v\n", err)
continue
}
if len(test.resources) != len(res) {
t.Errorf("Number of resources does not match: expected %d, got %d.",
len(test.resources), len(res))
continue
}
cmp := func(docs []*Document) func(i, j int) bool {
return func(i, j int) bool {
if docs[i].RepositoryURL != docs[j].RepositoryURL {
return docs[i].RepositoryURL <
docs[j].RepositoryURL
}
if docs[i].FilePath != docs[j].FilePath {
return docs[i].FilePath <
docs[j].FilePath
}
return docs[i].DefaultBranch < docs[j].DefaultBranch
}
}
sort.Slice(test.resources, cmp(test.resources))
sort.Slice(res, cmp(res))
for i, r := range test.resources {
if !reflect.DeepEqual(res[i], r) {
t.Errorf("Expected '%+v' to equal '%+v'\n",
res[i], r)
}
}
}
}
func TestGetResourcesAndGenerators(t *testing.T) {
tests := []TestStructForGetResources{
{
doc: KustomizationDocument{
Document: Document{
RepositoryURL: "sigs.k8s.io/kustomize",
FilePath: "some/path/to/kdir/kustomization.yaml",
DocumentData: `
resources:
- file.yaml
generators:
- gen.yaml
transformers:
- tr.yaml
`},
},
resources: []*Document{
{
RepositoryURL: "sigs.k8s.io/kustomize",
FilePath: "some/path/to/kdir/gen.yaml",
FileType: "generator",
User: "sigs.k8s.io",
},
{
RepositoryURL: "sigs.k8s.io/kustomize",
FilePath: "some/path/to/kdir/file.yaml",
FileType: "resource",
User: "sigs.k8s.io",
},
},
},
}
runTest(t, tests, true, false, true)
}
func TestGetResourcesAndGeneratorsAndTransformers(t *testing.T) {
tests := []TestStructForGetResources{
{
doc: KustomizationDocument{
Document: Document{
RepositoryURL: "sigs.k8s.io/kustomize",
FilePath: "some/path/to/kdir/kustomization.yaml",
DocumentData: `
resources:
- file.yaml
generators:
- gen.yaml
transformers:
- tr.yaml
`},
},
resources: []*Document{
{
RepositoryURL: "sigs.k8s.io/kustomize",
FilePath: "some/path/to/kdir/tr.yaml",
FileType: "transformer",
User: "sigs.k8s.io",
},
{
RepositoryURL: "sigs.k8s.io/kustomize",
FilePath: "some/path/to/kdir/gen.yaml",
FileType: "generator",
User: "sigs.k8s.io",
},
{
RepositoryURL: "sigs.k8s.io/kustomize",
FilePath: "some/path/to/kdir/file.yaml",
FileType: "resource",
User: "sigs.k8s.io",
},
},
},
}
runTest(t, tests, true, true, true)
}

@@ -1,124 +0,0 @@
package doc
import (
"crypto/sha256"
"fmt"
"path"
"strings"
"time"
"sigs.k8s.io/kustomize/api/internal/git"
)
type Document struct {
RepositoryURL string `json:"repositoryUrl,omitempty"`
// User makes it easy to aggregate data at the user level instead
// of at the repository level.
User string `json:"user,omitempty"`
FilePath string `json:"filePath,omitempty"`
DefaultBranch string `json:"defaultBranch,omitempty"`
DocumentData string `json:"document,omitempty"`
CreationTime *time.Time `json:"creationTime,omitempty"`
IsSame bool `json:"-"`
// FileType can be one of the following:
// "generator", "transformer", "resource", "".
FileType string `json:"fileType,omitempty"`
}
// GetDocument implements the CrawlerDocument interface.
func (doc *Document) GetDocument() *Document {
return doc
}
// Copy returns a shallow copy of doc; the CreationTime pointer is shared with
// the original.
func (doc *Document) Copy() *Document {
return &Document{
RepositoryURL: doc.RepositoryURL,
User: doc.User,
FilePath: doc.FilePath,
DefaultBranch: doc.DefaultBranch,
DocumentData: doc.DocumentData,
CreationTime: doc.CreationTime,
IsSame: doc.IsSame,
FileType: doc.FileType,
}
}
func (doc *Document) Path() string {
return fmt.Sprintf("repoURL: %s filePath: %s branch: %s",
doc.RepositoryURL, doc.FilePath, doc.DefaultBranch)
}
// WasCached implements the CrawlerDocument interface.
func (doc *Document) WasCached() bool {
return doc.IsSame
}
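// FromRelativePath returns the Document that newFile refers to. If newFile
// parses as a git repository URL, the Document points into that repository;
// otherwise newFile is resolved relative to the directory of doc.FilePath
// within doc's repository.
func (doc *Document) FromRelativePath(newFile string) (Document, error) {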
repoSpec, err := git.NewRepoSpecFromUrl(newFile)
if err == nil {
return Document{
RepositoryURL: repoSpec.Host + path.Clean(repoSpec.OrgRepo),
FilePath: path.Clean(repoSpec.Path),
DefaultBranch: repoSpec.Ref,
User: UserName(repoSpec.Host + path.Clean(repoSpec.OrgRepo)),
}, nil
}
// Otherwise newFile is probably a relative path within the same repository.
ret := Document{
RepositoryURL: doc.RepositoryURL,
DefaultBranch: doc.DefaultBranch,
User: UserName(doc.RepositoryURL),
}
ogDir, _ := path.Split(doc.FilePath)
cleaned := path.Clean(newFile)
if !path.IsAbs(cleaned) {
cleaned = path.Clean(ogDir + "/" + cleaned)
}
ret.FilePath = cleaned
return ret, nil
}
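// ID returns the hex-encoded SHA-256 digest of the repository URL, default
// branch, and file path, which together identify the document.
func (doc *Document) ID() string {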
sum := sha256.Sum256([]byte(strings.Join(
[]string{
doc.RepositoryURL,
doc.DefaultBranch,
doc.FilePath,
},
"---|---")))
return fmt.Sprintf("%x", sum)
}
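// RepositoryFullName returns the last two path segments of the repository URL
// (e.g. "user/repo"), or the trimmed URL when it has fewer than two segments.
func (doc *Document) RepositoryFullName() string {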
url := TrimUrl(doc.RepositoryURL)
sections := strings.Split(url, "/")
l := len(sections)
if l < 2 {
return url
}
return path.Join(sections[l-2], sections[l-1])
}
// TrimUrl removes all trailing slashes and the "git@github.com:" prefix (if present).
func TrimUrl(s string) string {
url := strings.TrimRight(s, "/")
gitPrefix := "git@github.com:"
if strings.HasPrefix(url, gitPrefix) {
url = url[len(gitPrefix):]
}
return url
}
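// UserName returns the second-to-last path segment of repositoryURL (the user
// or organization), or the trimmed URL when it has fewer than two segments.
func UserName(repositoryURL string) string {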
url := TrimUrl(repositoryURL)
sections := strings.Split(url, "/")
l := len(sections)
if l < 2 {
return url
}
return sections[l-2]
}
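
The relative-path branch of FromRelativePath comes down to joining the referenced path with the directory of the kustomization file and normalizing the result. The following is a minimal sketch of that step only, assuming plain slash-separated repository paths; the `resolve` helper is hypothetical and the git URL branch is left out. It uses `path.Dir` and `path.Join` rather than the string concatenation in the original, so a kustomization file at the repository root resolves to a relative path instead of one with a leading slash.

```go
package main

import (
	"fmt"
	"path"
)

// resolve is a hypothetical helper. It resolves ref, as written in the
// kustomization file at kustomizationPath, to a cleaned path within the
// same repository. Absolute refs are only cleaned.
func resolve(kustomizationPath, ref string) string {
	cleaned := path.Clean(ref)
	if path.IsAbs(cleaned) {
		return cleaned
	}
	// path.Join cleans the result, collapsing any "../" segments.
	return path.Join(path.Dir(kustomizationPath), cleaned)
}

func main() {
	base := "path/to/file/kustomization.yaml"
	fmt.Println(resolve(base, "../other/file/resource.yaml")) // prints path/to/other/file/resource.yaml
	fmt.Println(resolve(base, "service.yaml"))                // prints path/to/file/service.yaml
}
```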

@@ -1,150 +0,0 @@
package doc
import (
"reflect"
"testing"
)
func TestFromRelativePath(t *testing.T) {
type Case struct {
RelativePath string
Expected Document
}
testCases := []struct {
BaseDoc Document
Cases []Case
}{
{
BaseDoc: Document{
RepositoryURL: "example.com/repo",
FilePath: "path/to/file/kustomization.yaml",
DefaultBranch: "master",
},
Cases: []Case{
{
RelativePath: "../other/file/resource.yaml",
Expected: Document{
RepositoryURL: "example.com/repo",
FilePath: "path/to/other/file/resource.yaml",
DefaultBranch: "master",
User: "example.com",
},
},
{
RelativePath: "../file/../../something/../to/other/file/patch.yaml",
Expected: Document{
RepositoryURL: "example.com/repo",
FilePath: "path/to/other/file/patch.yaml",
DefaultBranch: "master",
User: "example.com",
},
},
{
RelativePath: "service.yaml",
Expected: Document{
RepositoryURL: "example.com/repo",
FilePath: "path/to/file/service.yaml",
DefaultBranch: "master",
User: "example.com",
},
},
},
},
}
for _, tc := range testCases {
for _, c := range tc.Cases {
rd, err := tc.BaseDoc.FromRelativePath(c.RelativePath)
if err != nil {
t.Errorf("unexpected error: %v", err)
}
if !reflect.DeepEqual(rd, c.Expected) {
t.Errorf("document mismatch expected %v, got %v", c.Expected, rd)
}
}
}
}
func TestDocument_RepositoryFullName(t *testing.T) {
testCases := []struct {
doc Document
expectedRepositoryFullName string
}{
{
doc: Document{
RepositoryURL: "https://github.com/user/repo",
},
expectedRepositoryFullName: "user/repo",
},
{
doc: Document{
RepositoryURL: "https://github.com//user/repo////",
},
expectedRepositoryFullName: "user/repo",
},
{
doc: Document{
RepositoryURL: "repo/",
},
expectedRepositoryFullName: "repo",
},
{
doc: Document{
RepositoryURL: "",
},
expectedRepositoryFullName: "",
},
{
doc: Document{
RepositoryURL: "git@github.com:user/repo",
},
expectedRepositoryFullName: "user/repo",
},
}
for _, tc := range testCases {
returnedRepositoryFullName := tc.doc.RepositoryFullName()
if returnedRepositoryFullName != tc.expectedRepositoryFullName {
t.Errorf("RepositoryFullName expected %s, got %s",
tc.expectedRepositoryFullName,
returnedRepositoryFullName)
}
}
}
func TestDocument_UserName(t *testing.T) {
testCases := []struct {
repositoryURL string
expectedUserName string
}{
{
repositoryURL: "https://github.com/user/repo",
expectedUserName: "user",
},
{
repositoryURL: "https://github.com//user/repo////",
expectedUserName: "user",
},
{
repositoryURL: "repo/",
expectedUserName: "repo",
},
{
repositoryURL: "",
expectedUserName: "",
},
{
repositoryURL: "git@github.com:user/repo",
expectedUserName: "user",
},
}
for _, tc := range testCases {
returnedUserName := UserName(tc.repositoryURL)
if returnedUserName != tc.expectedUserName {
t.Errorf("UserName expected %s, got %s",
tc.expectedUserName, returnedUserName)
}
}
}

@@ -1,51 +0,0 @@
package doc
import (
"fmt"
"regexp"
"sigs.k8s.io/yaml"
)
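// FixKustomizationPreUnmarshallingNonFatal rewrites deprecated field names in
// raw kustomization bytes before unmarshalling: "imageTags:" becomes
// "images:", and "patches:" becomes "patchesStrategicMerge:" when the patches
// list contains plain strings. The (possibly rewritten) data is returned
// together with any error from inspecting the patches field.
func FixKustomizationPreUnmarshallingNonFatal(data []byte) ([]byte, error) {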
deprecateFieldsMap := map[string]string{
"imageTags:": "images:",
}
for oldname, newname := range deprecateFieldsMap {
pattern := regexp.MustCompile(oldname)
data = pattern.ReplaceAll(data, []byte(newname))
}
found, err := useLegacyPatch(data)
if err == nil && found {
pattern := regexp.MustCompile("patches:")
data = pattern.ReplaceAll(data, []byte("patchesStrategicMerge:"))
}
return data, err
}
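// useLegacyPatch reports whether the top-level patches field is a list that
// contains at least one plain string, which the caller treats as the legacy
// patch syntax.
func useLegacyPatch(data []byte) (bool, error) {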
found := false
var object map[string]interface{}
err := yaml.Unmarshal(data, &object)
if err != nil {
return false, fmt.Errorf("invalid content from %s: %v",
string(data), err)
}
if rawPatches, ok := object["patches"]; ok {
patches, ok := rawPatches.([]interface{})
if !ok {
return false, fmt.Errorf("invalid patches from %v",
rawPatches)
}
for _, p := range patches {
_, ok := p.(string)
if ok {
found = true
}
}
}
return found, nil
}
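
The field renames above use unanchored patterns, so "imageTags:" is replaced wherever it occurs in the file, including inside comments and nested keys. One possible variation, shown here as a sketch rather than as the behavior of the deleted code, anchors the pattern to the start of a line so that only top-level keys are renamed. The `renameTopLevelKey` helper is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// renameTopLevelKey is a hypothetical helper. It renames oldKey to newKey
// only where oldKey starts a line, i.e. where it is a top-level YAML key.
func renameTopLevelKey(data []byte, oldKey, newKey string) []byte {
	// (?m) makes ^ match at the start of every line, not just the input.
	pattern := regexp.MustCompile(`(?m)^` + regexp.QuoteMeta(oldKey) + `:`)
	// ReplaceAllLiteral avoids interpreting "$" in the replacement.
	return pattern.ReplaceAllLiteral(data, []byte(newKey+":"))
}

func main() {
	in := []byte("imageTags:\n- name: nginx\n# imageTags: was renamed\n")
	// Only the first line changes; the comment is left untouched.
	fmt.Print(string(renameTopLevelKey(in, "imageTags", "images")))
}
```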

@@ -1,36 +0,0 @@
package doc
import (
"sigs.k8s.io/kustomize/api/internal/crawl/utils"
)
// UniqueDocuments makes sure a Document with a given ID appears only once.
type UniqueDocuments struct {
docs []*Document
docIDs utils.SeenMap
}
func NewUniqueDocuments() UniqueDocuments {
return UniqueDocuments{
docs: []*Document{},
docIDs: utils.NewSeenMap(),
}
}
func (uds *UniqueDocuments) Add(d *Document) {
if uds.docIDs.Seen(d.ID()) {
return
}
uds.docs = append(uds.docs, d)
uds.docIDs.Set(d.ID(), "")
}
func (uds *UniqueDocuments) AddDocuments(docs []*Document) {
for _, d := range docs {
uds.Add(d)
}
}
func (uds *UniqueDocuments) Documents() []*Document {
return uds.docs
}
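
UniqueDocuments pairs an ordered slice with a set of seen IDs. Since utils.SeenMap is not shown in this part of the diff, the sketch below substitutes a plain Go map as the set; the `uniqueIDs` type and its methods are hypothetical and track only the ID strings rather than whole documents.

```go
package main

import "fmt"

// uniqueIDs is a hypothetical stand-in for UniqueDocuments. It keeps IDs in
// insertion order and ignores any ID that was already added.
type uniqueIDs struct {
	ordered []string
	seen    map[string]struct{}
}

func newUniqueIDs() *uniqueIDs {
	return &uniqueIDs{seen: map[string]struct{}{}}
}

// Add records id and reports whether it was new.
func (u *uniqueIDs) Add(id string) bool {
	if _, ok := u.seen[id]; ok {
		return false
	}
	u.seen[id] = struct{}{}
	u.ordered = append(u.ordered, id)
	return true
}

// IDs returns the recorded IDs in the order they were first added.
func (u *uniqueIDs) IDs() []string { return u.ordered }

func main() {
	u := newUniqueIDs()
	for _, id := range []string{"a", "b", "a"} {
		u.Add(id)
	}
	fmt.Println(u.IDs()) // prints [a b]
}
```

The set gives constant-time duplicate checks, while the slice preserves the order in which documents were discovered.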

@@ -1,15 +0,0 @@
module sigs.k8s.io/kustomize/api/internal/crawl
go 1.16
require (
github.com/elastic/go-elasticsearch/v6 v6.8.5
github.com/gomodule/redigo v2.0.0+incompatible
github.com/gorilla/mux v1.7.3
github.com/gregjones/httpcache v0.0.0-20190611155906-901d90724c79
github.com/rs/cors v1.7.0
sigs.k8s.io/kustomize/api v0.0.0
sigs.k8s.io/yaml v1.2.0
)
replace sigs.k8s.io/kustomize/api v0.0.0 => ../../../api

@@ -1,275 +0,0 @@
cloud.google.com/go v0.26.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw=
github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
github.com/OneOfOne/xxhash v1.2.2/go.mod h1:HSdplMjZKSmBqAxg5vPj2TmRDmfkzw+cTzAElWljhcU=
github.com/PuerkitoBio/purell v1.1.0/go.mod h1:c11w/QuzBsJSee3cPx9rAFu61PvFxuPbtSwDGJws/X0=
github.com/PuerkitoBio/purell v1.1.1 h1:WEQqlqaGbrPkxLJWfBwQmfEAE1Z7ONdDLqrN38tNFfI=
github.com/PuerkitoBio/purell v1.1.1/go.mod h1:c11w/QuzBsJSee3cPx9rAFu61PvFxuPbtSwDGJws/X0=
github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578 h1:d+Bc7a5rLufV/sSk/8dngufqelfh6jnri85riMAaF/M=
github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578/go.mod h1:uGdkoq3SwY9Y+13GIhn11/XLaGBb4BfwItxLd5jeuXE=
github.com/agnivade/levenshtein v1.0.1/go.mod h1:CURSv5d9Uaml+FovSIICkLbAUZ9S4RqaHDIsdSBg7lM=
github.com/alecthomas/template v0.0.0-20160405071501-a0175ee3bccc/go.mod h1:LOuyumcjzFXgccqObfd/Ljyb9UuFJ6TxHnclSeseNhc=
github.com/alecthomas/units v0.0.0-20151022065526-2efee857e7cf/go.mod h1:ybxpYRFXyAe+OPACYpWeL0wqObRcbAqCMya13uyzqw0=
github.com/andreyvit/diff v0.0.0-20170406064948-c7f18ee00883/go.mod h1:rCTlJbsFo29Kk6CurOXKm700vrz8f0KW0JNfpkRJY/8=
github.com/armon/consul-api v0.0.0-20180202201655-eb2c6b5be1b6/go.mod h1:grANhF5doyWs3UAsr3K4I6qtAmlQcZDesFNEHPZAzj8=
github.com/asaskevich/govalidator v0.0.0-20180720115003-f9ffefc3facf/go.mod h1:lB+ZfQJz7igIIfQNfa7Ml4HSf2uFQQRzpGGRXenZAgY=
github.com/asaskevich/govalidator v0.0.0-20190424111038-f61b66f89f4a/go.mod h1:lB+ZfQJz7igIIfQNfa7Ml4HSf2uFQQRzpGGRXenZAgY=
github.com/beorn7/perks v0.0.0-20180321164747-3a771d992973/go.mod h1:Dwedo/Wpr24TaqPxmxbtue+5NUziq4I4S80YR8gNf3Q=
github.com/beorn7/perks v1.0.0/go.mod h1:KWe93zE9D1o94FZ5RNwFwVgaQK1VOXiVxmqh+CedLV8=
github.com/cespare/xxhash v1.1.0/go.mod h1:XrSqR1VqqWfGrhpAt58auRo0WTKS1nRRg3ghfAqPWnc=
github.com/chzyer/logex v1.1.10/go.mod h1:+Ywpsq7O8HXn0nuIou7OrIPyXbp3wmkHB+jjWRnGsAI=
github.com/chzyer/readline v0.0.0-20180603132655-2972be24d48e/go.mod h1:nSuG5e5PlCu98SY8svDHJxuZscDgtXS6KTTbou5AhLI=
github.com/chzyer/test v0.0.0-20180213035817-a1ea475d72b1/go.mod h1:Q3SI9o4m/ZMnBNeIyt5eFwwo7qiLfzFZmjNmxjkiQlU=
github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw=
github.com/coreos/bbolt v1.3.2/go.mod h1:iRUV2dpdMOn7Bo10OQBFzIJO9kkE559Wcmn+qkEiiKk=
github.com/coreos/etcd v3.3.10+incompatible/go.mod h1:uF7uidLiAD3TWHmW31ZFd/JWoc32PjwdhPthX9715RE=
github.com/coreos/go-semver v0.2.0/go.mod h1:nnelYz7RCh+5ahJtPPxZlU+153eP4D4r3EedlOD2RNk=
github.com/coreos/go-systemd v0.0.0-20190321100706-95778dfbb74e/go.mod h1:F5haX7vjVVG0kc13fIWeqUViNPyEJxv/OmvnBo0Yme4=
github.com/coreos/pkg v0.0.0-20180928190104-399ea9e2e55f/go.mod h1:E3G3o1h8I7cfcXa63jLwjI0eiQQMgzzUDFVpN/nH/eA=
github.com/cpuguy83/go-md2man/v2 v2.0.0/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/dgrijalva/jwt-go v3.2.0+incompatible/go.mod h1:E3ru+11k8xSBh+hMPgOLZmtrrCbhqsmaPHjLKYnJCaQ=
github.com/dgryski/go-sip13 v0.0.0-20181026042036-e10d5fee7954/go.mod h1:vAd38F8PWV+bWy6jNmig1y/TA+kYO4g3RSRF0IAv0no=
github.com/docker/go-units v0.3.3/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
github.com/docker/go-units v0.4.0/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
github.com/elastic/go-elasticsearch/v6 v6.8.5 h1:U2HtkBseC1FNBmDr0TR2tKltL6FxoY+niDAlj5M8TK8=
github.com/elastic/go-elasticsearch/v6 v6.8.5/go.mod h1:UwaDJsD3rWLM5rKNFzv9hgox93HoX8utj1kxD9aFUcI=
github.com/evanphx/json-patch v4.5.0+incompatible/go.mod h1:50XU6AFN0ol/bzJsmQLiYLvXMP4fmwYFNcr97nuDLSk=
github.com/fsnotify/fsnotify v1.4.7/go.mod h1:jwhsz4b93w/PPRr/qN1Yymfu8t87LnFCMoQvtojpjFo=
github.com/ghodss/yaml v1.0.0/go.mod h1:4dBDuWmgqj2HViK6kFavaiC9ZROes6MMH2rRYeMEF04=
github.com/globalsign/mgo v0.0.0-20180905125535-1ca0a4f7cbcb/go.mod h1:xkRDCp4j0OGD1HRkm4kmhM+pmpv3AKq5SU7GMg4oO/Q=
github.com/globalsign/mgo v0.0.0-20181015135952-eeefdecb41b8/go.mod h1:xkRDCp4j0OGD1HRkm4kmhM+pmpv3AKq5SU7GMg4oO/Q=
github.com/go-errors/errors v1.0.1 h1:LUHzmkK3GUKUrL/1gfBUxAHzcev3apQlezX/+O7ma6w=
github.com/go-errors/errors v1.0.1/go.mod h1:f4zRHt4oKfwPJE5k8C9vpYG+aDHdBFUsgrm6/TyX73Q=
github.com/go-kit/kit v0.8.0/go.mod h1:xBxKIO96dXMWWy0MnWVtmwkA9/13aqxPnvrjFYMA2as=
github.com/go-logfmt/logfmt v0.3.0/go.mod h1:Qt1PoO58o5twSAckw1HlFXLmHsOX5/0LbT9GBnD5lWE=
github.com/go-logfmt/logfmt v0.4.0/go.mod h1:3RMwSq7FuexP4Kalkev3ejPJsZTpXXBr9+V4qmtdjCk=
github.com/go-openapi/analysis v0.0.0-20180825180245-b006789cd277/go.mod h1:k70tL6pCuVxPJOHXQ+wIac1FUrvNkHolPie/cLEU6hI=
github.com/go-openapi/analysis v0.17.0/go.mod h1:IowGgpVeD0vNm45So8nr+IcQ3pxVtpRoBWb8PVZO0ik=
github.com/go-openapi/analysis v0.18.0/go.mod h1:IowGgpVeD0vNm45So8nr+IcQ3pxVtpRoBWb8PVZO0ik=
github.com/go-openapi/analysis v0.19.2/go.mod h1:3P1osvZa9jKjb8ed2TPng3f0i/UY9snX6gxi44djMjk=
github.com/go-openapi/analysis v0.19.5/go.mod h1:hkEAkxagaIvIP7VTn8ygJNkd4kAYON2rCu0v0ObL0AU=
github.com/go-openapi/errors v0.17.0/go.mod h1:LcZQpmvG4wyF5j4IhA73wkLFQg+QJXOQHVjmcZxhka0=
github.com/go-openapi/errors v0.18.0/go.mod h1:LcZQpmvG4wyF5j4IhA73wkLFQg+QJXOQHVjmcZxhka0=
github.com/go-openapi/errors v0.19.2/go.mod h1:qX0BLWsyaKfvhluLejVpVNwNRdXZhEbTA4kxxpKBC94=
github.com/go-openapi/jsonpointer v0.17.0/go.mod h1:cOnomiV+CVVwFLk0A/MExoFMjwdsUdVpsRhURCKh+3M=
github.com/go-openapi/jsonpointer v0.18.0/go.mod h1:cOnomiV+CVVwFLk0A/MExoFMjwdsUdVpsRhURCKh+3M=
github.com/go-openapi/jsonpointer v0.19.2/go.mod h1:3akKfEdA7DF1sugOqz1dVQHBcuDBPKZGEoHC/NkiQRg=
github.com/go-openapi/jsonpointer v0.19.3 h1:gihV7YNZK1iK6Tgwwsxo2rJbD1GTbdm72325Bq8FI3w=
github.com/go-openapi/jsonpointer v0.19.3/go.mod h1:Pl9vOtqEWErmShwVjC8pYs9cog34VGT37dQOVbmoatg=
github.com/go-openapi/jsonreference v0.17.0/go.mod h1:g4xxGn04lDIRh0GJb5QlpE3HfopLOL6uZrK/VgnsK9I=
github.com/go-openapi/jsonreference v0.18.0/go.mod h1:g4xxGn04lDIRh0GJb5QlpE3HfopLOL6uZrK/VgnsK9I=
github.com/go-openapi/jsonreference v0.19.2/go.mod h1:jMjeRr2HHw6nAVajTXJ4eiUwohSTlpa0o73RUL1owJc=
github.com/go-openapi/jsonreference v0.19.3 h1:5cxNfTy0UVC3X8JL5ymxzyoUZmo8iZb+jeTWn7tUa8o=
github.com/go-openapi/jsonreference v0.19.3/go.mod h1:rjx6GuL8TTa9VaixXglHmQmIL98+wF9xc8zWvFonSJ8=
github.com/go-openapi/loads v0.17.0/go.mod h1:72tmFy5wsWx89uEVddd0RjRWPZm92WRLhf7AC+0+OOU=
github.com/go-openapi/loads v0.18.0/go.mod h1:72tmFy5wsWx89uEVddd0RjRWPZm92WRLhf7AC+0+OOU=
github.com/go-openapi/loads v0.19.0/go.mod h1:72tmFy5wsWx89uEVddd0RjRWPZm92WRLhf7AC+0+OOU=
github.com/go-openapi/loads v0.19.2/go.mod h1:QAskZPMX5V0C2gvfkGZzJlINuP7Hx/4+ix5jWFxsNPs=
github.com/go-openapi/loads v0.19.4/go.mod h1:zZVHonKd8DXyxyw4yfnVjPzBjIQcLt0CCsn0N0ZrQsk=
github.com/go-openapi/runtime v0.0.0-20180920151709-4f900dc2ade9/go.mod h1:6v9a6LTXWQCdL8k1AO3cvqx5OtZY/Y9wKTgaoP6YRfA=
github.com/go-openapi/runtime v0.19.0/go.mod h1:OwNfisksmmaZse4+gpV3Ne9AyMOlP1lt4sK4FXt0O64=
github.com/go-openapi/runtime v0.19.4/go.mod h1:X277bwSUBxVlCYR3r7xgZZGKVvBd/29gLDlFGtJ8NL4=
github.com/go-openapi/spec v0.17.0/go.mod h1:XkF/MOi14NmjsfZ8VtAKf8pIlbZzyoTvZsdfssdxcBI=
github.com/go-openapi/spec v0.18.0/go.mod h1:XkF/MOi14NmjsfZ8VtAKf8pIlbZzyoTvZsdfssdxcBI=
github.com/go-openapi/spec v0.19.2/go.mod h1:sCxk3jxKgioEJikev4fgkNmwS+3kuYdJtcsZsD5zxMY=
github.com/go-openapi/spec v0.19.3/go.mod h1:FpwSN1ksY1eteniUU7X0N/BgJ7a4WvBFVA8Lj9mJglo=
github.com/go-openapi/spec v0.19.5 h1:Xm0Ao53uqnk9QE/LlYV5DEU09UAgpliA85QoT9LzqPw=
github.com/go-openapi/spec v0.19.5/go.mod h1:Hm2Jr4jv8G1ciIAo+frC/Ft+rR2kQDh8JHKHb3gWUSk=
github.com/go-openapi/strfmt v0.17.0/go.mod h1:P82hnJI0CXkErkXi8IKjPbNBM6lV6+5pLP5l494TcyU=
github.com/go-openapi/strfmt v0.18.0/go.mod h1:P82hnJI0CXkErkXi8IKjPbNBM6lV6+5pLP5l494TcyU=
github.com/go-openapi/strfmt v0.19.0/go.mod h1:+uW+93UVvGGq2qGaZxdDeJqSAqBqBdl+ZPMF/cC8nDY=
github.com/go-openapi/strfmt v0.19.3/go.mod h1:0yX7dbo8mKIvc3XSKp7MNfxw4JytCfCD6+bY1AVL9LU=
github.com/go-openapi/strfmt v0.19.5/go.mod h1:eftuHTlB/dI8Uq8JJOyRlieZf+WkkxUuk0dgdHXr2Qk=
github.com/go-openapi/swag v0.17.0/go.mod h1:AByQ+nYG6gQg71GINrmuDXCPWdL640yX49/kXLo40Tg=
github.com/go-openapi/swag v0.18.0/go.mod h1:AByQ+nYG6gQg71GINrmuDXCPWdL640yX49/kXLo40Tg=
github.com/go-openapi/swag v0.19.2/go.mod h1:POnQmlKehdgb5mhVOsnJFsivZCEZ/vjK9gh66Z9tfKk=
github.com/go-openapi/swag v0.19.5 h1:lTz6Ys4CmqqCQmZPBlbQENR1/GucA2bzYTE12Pw4tFY=
github.com/go-openapi/swag v0.19.5/go.mod h1:POnQmlKehdgb5mhVOsnJFsivZCEZ/vjK9gh66Z9tfKk=
github.com/go-openapi/validate v0.18.0/go.mod h1:Uh4HdOzKt19xGIGm1qHf/ofbX1YQ4Y+MYsct2VUrAJ4=
github.com/go-openapi/validate v0.19.2/go.mod h1:1tRCw7m3jtI8eNWEEliiAqUIcBztB2KDnRCRMUi7GTA=
github.com/go-openapi/validate v0.19.8/go.mod h1:8DJv2CVJQ6kGNpFW6eV9N3JviE1C85nY1c2z52x1Gk4=
github.com/go-stack/stack v1.8.0/go.mod h1:v0f6uXyyMGvRgIKkXu+yp6POWl0qKG85gN/melR3HDY=
github.com/gobuffalo/here v0.6.0/go.mod h1:wAG085dHOYqUpf+Ap+WOdrPTp5IYcDAs/x7PLa8Y5fM=
github.com/gogo/protobuf v1.1.1/go.mod h1:r8qH/GZQm5c6nD/R0oafs1akxWv10x8SbQlK7atdtwQ=
github.com/gogo/protobuf v1.2.1/go.mod h1:hp+jE20tsWTFYpLwKvXlhS1hjn+gTNwPg2I6zVXpSg4=
github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b/go.mod h1:SBH7ygxi8pfUlaOkMMuAQtPIUF8ecWP5IEl/CR7VP2Q=
github.com/golang/groupcache v0.0.0-20190129154638-5b532d6fd5ef/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc=
github.com/golang/mock v1.1.1/go.mod h1:oTYuIxOrZwtPieC+H1uAHpcLFnEyAGVDL/k47Jfbm0A=
github.com/golang/protobuf v1.2.0/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/golang/protobuf v1.3.1/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/gomodule/redigo v2.0.0+incompatible h1:K/R+8tc58AaqLkqG2Ol3Qk+DR/TlNuhuh457pBFPtt0=
github.com/gomodule/redigo v2.0.0+incompatible/go.mod h1:B4C85qUVwatsJoIUNIfCRsp7qO0iAmpGFZ4EELWSbC4=
github.com/google/btree v1.0.0/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ=
github.com/google/go-cmp v0.2.0/go.mod h1:oXzfMopK8JAjlY9xF4vHSVASa0yLyX7SntLO5aqRK0M=
github.com/google/go-cmp v0.3.0/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
github.com/google/go-cmp v0.4.0 h1:xsAVV57WRhGj6kEIi8ReJzQlHHqcBYCElAvkovg3B/4=
github.com/google/go-cmp v0.4.0/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510/go.mod h1:pupxD2MaaD3pAXIBCelhxNneeOaAeabZDe5s4K6zSpQ=
github.com/google/uuid v1.0.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/google/uuid v1.1.1/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/gorilla/mux v1.7.3 h1:gnP5JzjVOuiZD07fKKToCAOjS0yOpj/qPETTXCCS6hw=
github.com/gorilla/mux v1.7.3/go.mod h1:1lud6UwP+6orDFRuTfBEV8e9/aOM/c4fVVCaMa2zaAs=
github.com/gorilla/websocket v1.4.0/go.mod h1:E7qHFY5m1UJ88s3WnNqhKjPHQ0heANvMoAMk2YaljkQ=
github.com/gregjones/httpcache v0.0.0-20190611155906-901d90724c79 h1:+ngKgrYPPJrOjhax5N+uePQ0Fh1Z7PheYoUI/0nzkPA=
github.com/gregjones/httpcache v0.0.0-20190611155906-901d90724c79/go.mod h1:FecbI9+v66THATjSRHfNgh1IVFe/9kFxbXtjV0ctIMA=
github.com/grpc-ecosystem/go-grpc-middleware v1.0.0/go.mod h1:FiyG127CGDf3tlThmgyCl78X/SZQqEOJBCDaAfeWzPs=
github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0/go.mod h1:8NvIoxWQoOIhqOTXgfV/d3M/q6VIi02HzZEHgUlZvzk=
github.com/grpc-ecosystem/grpc-gateway v1.9.0/go.mod h1:vNeuVxBJEsws4ogUvrchl83t/GYV9WGTSLVdBhOQFDY=
github.com/hashicorp/hcl v1.0.0/go.mod h1:E5yfLk+7swimpb2L/Alb/PJmXilQ/rhwaUYs4T20WEQ=
github.com/imdario/mergo v0.3.5/go.mod h1:2EnlNZ0deacrJVfApfmtdGgDfMuh/nq6Ok1EcJh5FfA=
github.com/inconshreveable/mousetrap v1.0.0/go.mod h1:PxqpIevigyE2G7u3NXJIT2ANytuPF1OarO4DADm73n8=
github.com/jonboulle/clockwork v0.1.0/go.mod h1:Ii8DK3G1RaLaWxj9trq07+26W01tbo22gdxWY5EU2bo=
github.com/julienschmidt/httprouter v1.2.0/go.mod h1:SYymIcj16QtmaHHD7aYtjjsJG7VTCxuUUipMqKk8s4w=
github.com/kisielk/errcheck v1.1.0/go.mod h1:EZBBE59ingxPouuu3KfxchcWSUPOHkagtvWXihfKN4Q=
github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck=
github.com/konsorten/go-windows-terminal-sequences v1.0.1/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ=
github.com/kr/logfmt v0.0.0-20140226030751-b84e30acd515/go.mod h1:+0opPa2QZZtGFBFZlji/RkVcI2GknAs/DXo4wKdlNEc=
github.com/kr/pretty v0.1.0 h1:L/CwN0zerZDmRFUapSPitk6f+Q3+0za1rQkzVuMiMFI=
github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo=
github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ=
github.com/kr/pty v1.1.5/go.mod h1:9r2w37qlBe7rQ6e1fg1S/9xpWHSnaqNdHD3WcMdbPDA=
github.com/kr/text v0.1.0 h1:45sCR5RtlFHMR4UwH9sdQ5TC8v0qDQCHnXt+kaKSTVE=
github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=
github.com/magiconair/properties v1.8.0/go.mod h1:PppfXfuXeibc/6YijjN8zIbojt8czPbwD3XqdrwzmxQ=
github.com/mailru/easyjson v0.0.0-20180823135443-60711f1a8329/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc=
github.com/mailru/easyjson v0.0.0-20190312143242-1de009706dbe/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc=
github.com/mailru/easyjson v0.0.0-20190614124828-94de47d64c63/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc=
github.com/mailru/easyjson v0.0.0-20190626092158-b2ccc519800e/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc=
github.com/mailru/easyjson v0.7.0 h1:aizVhC/NAAcKWb+5QsU1iNOZb4Yws5UO2I+aIprQITM=
github.com/mailru/easyjson v0.7.0/go.mod h1:KAzv3t3aY1NaHWoQz1+4F1ccyAH66Jk7yos7ldAVICs=
github.com/markbates/pkger v0.17.1/go.mod h1:0JoVlrol20BSywW79rN3kdFFsE5xYM+rSCQDXbLhiuI=
github.com/matttproud/golang_protobuf_extensions v1.0.1/go.mod h1:D8He9yQNgCq6Z5Ld7szi9bcBfOoFv/3dc6xSMkL2PC0=
github.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0=
github.com/mitchellh/mapstructure v1.1.2/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y=
github.com/monochromegane/go-gitignore v0.0.0-20200626010858-205db1a8cc00 h1:n6/2gBQ3RWajuToeY6ZtZTIKv2v7ThUy5KKusIT0yc0=
github.com/monochromegane/go-gitignore v0.0.0-20200626010858-205db1a8cc00/go.mod h1:Pm3mSP3c5uWn86xMLZ5Sa7JB9GsEZySvHYXCTK4E9q4=
github.com/mwitkow/go-conntrack v0.0.0-20161129095857-cc309e4a2223/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U=
github.com/oklog/ulid v1.3.1/go.mod h1:CirwcVhetQ6Lv90oh/F+FBtV6XMibvdAFo93nm5qn4U=
github.com/pborman/uuid v1.2.0/go.mod h1:X/NO0urCmaxf9VXbdlT7C2Yzkj2IKimNn4k+gtPdI/k=
github.com/pelletier/go-toml v1.2.0/go.mod h1:5z9KED0ma1S8pY6P1sdut58dfprrGBbd/94hg7ilaic=
github.com/pkg/errors v0.8.0/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4=
github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/prometheus/client_golang v0.9.1/go.mod h1:7SWBe2y4D6OKWSNQJUaRYU/AaXPKyh/dDVn+NZz0KFw=
github.com/prometheus/client_golang v0.9.3/go.mod h1:/TN21ttK/J9q6uSwhBd54HahCDft0ttaMvbicHlPoso=
github.com/prometheus/client_model v0.0.0-20180712105110-5c3871d89910/go.mod h1:MbSGuTsp3dbXC40dX6PRTWyKYBIrTGTE9sqQNg2J8bo=
github.com/prometheus/client_model v0.0.0-20190129233127-fd36f4220a90/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA=
github.com/prometheus/common v0.0.0-20181113130724-41aa239b4cce/go.mod h1:daVV7qP5qjZbuso7PdcryaAu0sAZbrN9i7WWcTMWvro=
github.com/prometheus/common v0.4.0/go.mod h1:TNfzLD0ON7rHzMJeJkieUDPYmFC7Snx/y86RQel1bk4=
github.com/prometheus/procfs v0.0.0-20181005140218-185b4288413d/go.mod h1:c3At6R/oaqEKCNdg8wHV1ftS6bRYblBhIjjI8uT2IGk=
github.com/prometheus/procfs v0.0.0-20190507164030-5867b95ac084/go.mod h1:TjEm7ze935MbeOT/UhFTIMYKhuLP4wbCsTZCD3I8kEA=
github.com/prometheus/tsdb v0.7.1/go.mod h1:qhTCs0VvXwvX/y3TZrWD7rabWM+ijKTux40TwIPHuXU=
github.com/rogpeppe/fastuuid v0.0.0-20150106093220-6724a57986af/go.mod h1:XWv6SoW27p1b0cqNHllgS5HIMJraePCO15w5zCzIWYg=
github.com/rs/cors v1.7.0 h1:+88SsELBHx5r+hZ8TCkggzSstaWNbDvThkVK8H6f9ik=
github.com/rs/cors v1.7.0/go.mod h1:gFx+x8UowdsKA9AchylcLynDq+nNFfI8FkUZdN/jGCU=
github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/sergi/go-diff v1.0.0/go.mod h1:0CfEIISq7TuYL3j771MWULgwwjU+GofnZX9QAmXWZgo=
github.com/sergi/go-diff v1.1.0/go.mod h1:STckp+ISIX8hZLjrqAeVduY0gWCT9IjLuqbuNXdaHfM=
github.com/shurcooL/sanitized_anchor_name v1.0.0/go.mod h1:1NzhyTcUVG4SuEtjjoZeVRXNmyL/1OwPU0+IJeTBvfc=
github.com/sirupsen/logrus v1.2.0/go.mod h1:LxeOpSwHxABJmUn/MG1IvRgCAasNZTLOkJPxbbu5VWo=
github.com/soheilhy/cmux v0.1.4/go.mod h1:IM3LyeVVIOuxMH7sFAkER9+bJ4dT7Ms6E4xg4kGIyLM=
github.com/spaolacci/murmur3 v0.0.0-20180118202830-f09979ecbc72/go.mod h1:JwIasOWyU6f++ZhiEuf87xNszmSA2myDM2Kzu9HwQUA=
github.com/spf13/afero v1.1.2/go.mod h1:j4pytiNVoe2o6bmDsKpLACNPDBIoEAkihy7loJ1B0CQ=
github.com/spf13/cast v1.3.0/go.mod h1:Qx5cxh0v+4UWYiBimWS+eyWzqEqokIECu5etghLkUJE=
github.com/spf13/cobra v1.0.0/go.mod h1:/6GTrnGXV9HjY+aR4k0oJ5tcvakLuG6EuKReYlHNrgE=
github.com/spf13/jwalterweatherman v1.0.0/go.mod h1:cQK4TGJAtQXfYWX+Ddv3mKDzgVb68N+wFjFa4jdeBTo=
github.com/spf13/pflag v1.0.3/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4=
github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
github.com/spf13/viper v1.4.0/go.mod h1:PTJ7Z/lr49W6bUbkmS1V3by4uWynFiR9p7+dSq/yZzE=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.1.1/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.2.0 h1:Hbg2NidpLE8veEBkEZTL3CvlkUIVzuU9jDplZO54c48=
github.com/stretchr/objx v0.2.0/go.mod h1:qt09Ya8vawLte6SNmTgCsAVtYtaKzEcn8ATUoHMkEqE=
github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs=
github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
github.com/stretchr/testify v1.4.0 h1:2E4SXV/wtOkTonXsotYi4li6zVWxYlZuYNCXe9XRJyk=
github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4=
github.com/tidwall/pretty v1.0.0/go.mod h1:XNkn88O1ChpSDQmQeStsy+sBenx6DDtFZJxhVysOjyk=
github.com/tmc/grpc-websocket-proxy v0.0.0-20190109142713-0ad062ec5ee5/go.mod h1:ncp9v5uamzpCO7NfCPTXjqaC+bZgJeR0sMTm6dMHP7U=
github.com/ugorji/go v1.1.4/go.mod h1:uQMGLiO92mf5W77hV/PUCpI3pbzQx3CRekS0kk+RGrc=
github.com/vektah/gqlparser v1.1.2/go.mod h1:1ycwN7Ij5njmMkPPAOaRFY4rET2Enx7IkVv3vaXspKw=
github.com/xiang90/probing v0.0.0-20190116061207-43a291ad63a2/go.mod h1:UETIi67q53MR2AWcXfiuqkDkRtnGDLqkBTpCHuJHxtU=
github.com/xlab/treeprint v0.0.0-20181112141820-a009c3971eca h1:1CFlNzQhALwjS9mBAUkycX616GzgsuYUOCHA5+HSlXI=
github.com/xlab/treeprint v0.0.0-20181112141820-a009c3971eca/go.mod h1:ce1O1j6UtZfjr22oyGxGLbauSBp2YVXpARAosm7dHBg=
github.com/xordataexchange/crypt v0.0.3-0.20170626215501-b2862e3d0a77/go.mod h1:aYKd//L2LvnjZzWKhF00oedf4jCCReLcmhLdhm1A27Q=
go.etcd.io/bbolt v1.3.2/go.mod h1:IbVyRI1SCnLcuJnV2u8VeU0CEYM7e686BmAb1XKL+uU=
go.mongodb.org/mongo-driver v1.0.3/go.mod h1:u7ryQJ+DOzQmeO7zB6MHyr8jkEQvC8vH7qLUO4lqsUM=
go.mongodb.org/mongo-driver v1.1.1/go.mod h1:u7ryQJ+DOzQmeO7zB6MHyr8jkEQvC8vH7qLUO4lqsUM=
go.mongodb.org/mongo-driver v1.1.2/go.mod h1:u7ryQJ+DOzQmeO7zB6MHyr8jkEQvC8vH7qLUO4lqsUM=
go.starlark.net v0.0.0-20200306205701-8dd3e2ee1dd5/go.mod h1:nmDLcffg48OtT/PSW0Hg7FvpRQsQh5OSqIylirxKC7o=
go.uber.org/atomic v1.4.0/go.mod h1:gD2HeocX3+yG+ygLZcrzQJaqmWj9AIm7n08wl/qW/PE=
go.uber.org/multierr v1.1.0/go.mod h1:wR5kodmAFQ0UK8QlbwjlSNy0Z68gJhDJUG5sjR94q/0=
go.uber.org/zap v1.10.0/go.mod h1:vwi/ZaCAaUcBkycHslxD9B2zi4UTXhF60s6SWpuDF0Q=
golang.org/x/crypto v0.0.0-20180904163835-0709b304e793/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20190320223903-b7391e95e576/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20190611184440-5c40567a22f8/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20190617133340-57b3e21c3d56/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/lint v0.0.0-20181026193005-c67002cb31c3/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE=
golang.org/x/lint v0.0.0-20190313153728-d0100b6bd8b3/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc=
golang.org/x/net v0.0.0-20180826012351-8a410e7b638d/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20181005035420-146acd28ed58/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20181114220301-adae6a3d119a/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20181220203305-927f97764cc3/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20190311183353-d8887717615a/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190320064053-1272bf9dcd53/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190522155817-f3200d17e092/go.mod h1:HSz+uSET+XFnRR8LxR5pz3Of3rY3CfYBVs4xY44aLks=
golang.org/x/net v0.0.0-20190613194153-d28f0bde5980/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20190827160401-ba9fcec4b297 h1:k7pJ2yAPLPgbskkFdhRCsA77k2fySZ1zf2zCjvQCiIM=
golang.org/x/net v0.0.0-20190827160401-ba9fcec4b297/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U=
golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20181108010431-42b317875d0f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20180905080454-ebe1bf3edb33/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20181107165924-66b7b1311ac8/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20181116152217-5ac8a444bdc5/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190321052220-f7bb7a8bee54/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190616124812-15dcb6c0061f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191002063906-3421d5a6bb1c/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.2 h1:tW2bmiBqwgJj/UpqtC8EpXEZVYOwU0yG4iWbprSVAcs=
golang.org/x/text v0.3.2/go.mod h1:bEr9sfX3Q8Zfm5fL9x+3itogRgK3+ptLWKqgva+5dAk=
golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/tools v0.0.0-20180221164845-07fd8470d635/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20190114222345-bf090417da8b/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20190125232054-d66bd3c5d5a6/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20190311212946-11955173bddd/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs=
golang.org/x/tools v0.0.0-20190614205625-5aca471b1d59/go.mod h1:/rFqwRUd4F7ZHNgwSSTFct+R/Kf4OFW1sUzUTQQTgfc=
golang.org/x/tools v0.0.0-20190617190820-da514acc4774/go.mod h1:/rFqwRUd4F7ZHNgwSSTFct+R/Kf4OFW1sUzUTQQTgfc=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
google.golang.org/appengine v1.1.0/go.mod h1:EbEs0AVv82hx2wNQdGPgUI5lhzA/G0D9YwlJXL52JkM=
google.golang.org/genproto v0.0.0-20180817151627-c66870c02cf8/go.mod h1:JiN7NxoALGmiZfu7CAH4rXhgtRTLTxftemlI0sWmxmc=
google.golang.org/grpc v1.19.0/go.mod h1:mqu4LbDTu4XGKhr4mRzUsmM4RtVoemTSY81AxZiDr8c=
google.golang.org/grpc v1.21.0/go.mod h1:oYelfM1adQP15Ek0mdvEgi9Df8B9CZIaU1084ijfRaM=
gopkg.in/alecthomas/kingpin.v2 v2.2.6/go.mod h1:FMv+mEhP44yOT+4EoQTLFTRgOQ1FBLkstjWtayDeSgw=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15 h1:YR8cESwS4TdDjEe65xsg0ogRM/Nc3DYOhEAlW+xobZo=
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/resty.v1 v1.12.0/go.mod h1:mDo4pnntr5jdWRML875a/NmxYqAlA73dVijT2AXvQQo=
gopkg.in/yaml.v2 v2.0.0-20170812160011-eb3733d160e7/go.mod h1:JAlM8MvJe8wmxCU4Bli9HhUf9+ttbYbLASfIpnQbh74=
gopkg.in/yaml.v2 v2.2.1/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.4/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.7/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.8/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY=
gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c h1:dUUwHk2QECo/6vqA44rthZ8ie2QXMNeKRTHCNY2nXvo=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
honnef.co/go/tools v0.0.0-20190102054323-c2f93a96b099/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4=
sigs.k8s.io/kustomize/kyaml v0.10.17 h1:4zrV0ym5AYa0e512q7K3Wp1u7mzoWW0xR3UHJcGWGIg=
sigs.k8s.io/kustomize/kyaml v0.10.17/go.mod h1:mlQFagmkm1P+W4lZJbJ/yaxMd8PqMRSC4cPcfUVt5Hg=
sigs.k8s.io/yaml v1.2.0 h1:kr/MCeFWJWTwyaHoR9c8EjH9OumOmoF9YGiZd7lFm/Q=
sigs.k8s.io/yaml v1.2.0/go.mod h1:yfXDCHCao9+ENCvLSE62v9VSji2MKu5jeNfTrofGhJc=

View File

@@ -1,23 +0,0 @@
package httpclient
import (
"net/http"
"time"
"github.com/gomodule/redigo/redis"
"github.com/gregjones/httpcache"
rediscache "github.com/gregjones/httpcache/redis"
)
func FromCache(header http.Header) bool {
return header.Get(httpcache.XFromCache) != ""
}
func NewClient(conn redis.Conn) *http.Client {
etagCache := rediscache.NewWithClient(conn)
tr := httpcache.NewTransport(etagCache)
return &http.Client{
Transport: tr,
Timeout: 10 * time.Second,
}
}

View File

@@ -1,331 +0,0 @@
package index
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"time"
es "github.com/elastic/go-elasticsearch/v6"
"github.com/elastic/go-elasticsearch/v6/esapi"
)
const IndexConfig = `
{
"mappings": {
"_doc": {
"properties": {
"repositoryUrl": {
"type": "keyword"
},
"user": {
"type": "keyword"
},
"filePath": {
"type": "keyword"
},
"defaultBranch": {
"type": "keyword"
},
"fileType": {
"type": "keyword"
},
"document": {
"type": "text"
},
"creationTime": {
"type": "date"
},
"kinds": {
"type": "text"
},
"identifiers": {
"type": "text"
},
"values": {
"type": "text"
}
}
}
}
}`
// TODO(damienr74) Split index into reader and writer?
type index struct {
ctx context.Context
client *es.Client
name string
}
func newIndex(ctx context.Context, name string) (*index, error) {
client, err := es.NewDefaultClient()
if err != nil {
return nil, err
}
return &index{
ctx: ctx,
client: client,
name: name,
}, nil
}
type readerFunc func(io.Reader) error
func ignoreResponseBody(_ io.Reader) error {
return nil
}
// responseErrorOrNil checks that the Elasticsearch request succeeded. If it did
// not, the response body is included in the returned error.
//
// Otherwise, the readerFunc is used to read the body, so callers only extract
// data from the response when the request was successful.
func (idx *index) responseErrorOrNil(info string, res *esapi.Response,
err error, reader readerFunc) error {
messageStart := fmt.Sprintf("index %s error: %s", idx.name, info)
if err != nil || res == nil {
return fmt.Errorf("%s: %v", messageStart, err)
}
defer res.Body.Close()
if res.IsError() {
return fmt.Errorf("%s: %s [%d]", messageStart, res.String(), res.StatusCode)
}
if reader != nil {
err = reader(res.Body)
if err != nil {
return fmt.Errorf("%s: %v", messageStart, err)
}
}
return nil
}
func byteJoin(bts ...interface{}) []byte {
ret := make([][]byte, len(bts))
for i, v := range bts {
switch bt := v.(type) {
case []byte:
ret[i] = bt
case string:
ret[i] = []byte(bt)
default:
ret[i] = []byte(fmt.Sprintf("%v", bt))
}
}
return bytes.Join(ret, []byte(` `))
}
// UpdateMapping updates the Elasticsearch index mappings, which describe how
// documents are indexed and searched.
func (idx *index) UpdateMapping(mappings []byte) error {
request := byteJoin(`{ "mappings":`, mappings, `}`)
op := idx.client.Indices.PutMapping
res, err := op(
bytes.NewReader(request),
op.WithContext(idx.ctx),
op.WithIndex(idx.name),
op.WithIncludeTypeName(true),
op.WithPretty(),
)
return idx.responseErrorOrNil(
fmt.Sprintf("could not update index mappings '%s'", request),
res, err, ignoreResponseBody)
}
// Update the elasticsearch index settings. (describes default parameters and
// some analyzer definitions, etc.)
func (idx *index) UpdateSetting(settings []byte) error {
request := byteJoin(`{ "settings": `, settings, `}`)
op := idx.client.Indices.PutSettings
res, err := op(
bytes.NewReader(request),
op.WithContext(idx.ctx),
op.WithIndex(idx.name),
op.WithPretty(),
)
return idx.responseErrorOrNil(
fmt.Sprintf("could not update index settings '%s'", request),
res, err, ignoreResponseBody)
}
// Create an index providing the config for both the mappings and the settings.
func (idx *index) CreateIndex(config []byte) error {
op := idx.client.Indices.Create
res, err := op(
idx.name,
op.WithBody(bytes.NewReader(config)),
op.WithContext(idx.ctx),
op.WithHuman(),
op.WithPretty(),
)
return idx.responseErrorOrNil(
fmt.Sprintf("could not create index with config '%s'", config),
res, err, ignoreResponseBody)
}
// Delete an index.
func (idx *index) DeleteIndex() error {
res, err := idx.client.Indices.Delete(
[]string{idx.name},
)
return idx.responseErrorOrNil("could not delete index",
res, err, ignoreResponseBody)
}
// Insert or update the document by ID.
func (idx *index) Put(uniqueID string, doc interface{}) error {
exists, err := idx.Exists(uniqueID)
if err != nil {
return err
}
if exists {
var docBytes []byte
docBytes, err = json.Marshal(doc)
if err != nil {
return err
}
body := byteJoin(`{"doc":`, docBytes, `}`)
// For a document with a given id, every call of IndexRequest.Do will increase the version of a document.
// To avoid increasing the document version unnecessarily, use UpdateRequest here.
req := esapi.UpdateRequest{
Index: idx.name,
Body: bytes.NewReader(body),
DocumentID: uniqueID,
}
var res *esapi.Response
res, err = req.Do(idx.ctx, idx.client)
err = idx.responseErrorOrNil("could not update document",
res, err, ignoreResponseBody)
} else {
var body []byte
body, err = json.Marshal(doc)
if err != nil {
return err
}
req := esapi.IndexRequest{
Index: idx.name,
Body: bytes.NewReader(body),
DocumentID: uniqueID,
}
var res *esapi.Response
res, err = req.Do(idx.ctx, idx.client)
err = idx.responseErrorOrNil("could not insert document",
res, err, ignoreResponseBody)
}
return err
}
type scrollUpdater func(string, readerFunc) error
// Update the scroll for iteration. If no scroll exists, create one.
func (idx *index) scrollUpdater(query []byte, batchSize int,
timeout time.Duration) scrollUpdater {
return func(scrollID string, reader readerFunc) error {
var res *esapi.Response
var err error
if scrollID == "" {
search := idx.client.Search
res, err = search(
search.WithContext(idx.ctx),
search.WithIndex(idx.name),
search.WithBody(bytes.NewBuffer(query)),
search.WithScroll(timeout),
search.WithSize(batchSize),
)
} else {
scroll := idx.client.Scroll
res, err = scroll(
scroll.WithContext(idx.ctx),
scroll.WithScroll(timeout),
scroll.WithScrollID(scrollID),
)
}
return idx.responseErrorOrNil(
fmt.Sprintf("could not scroll for query %s", query),
res, err, reader)
}
}
// Simple search options. Size is the number of results to return, and From is
// the offset of the first result to return within the ranked results. Together
// they provide a simple (stateless) pagination technique.
type SearchOptions struct {
Size int
From int
}
// Search for a query (json query dsl) with some options, and use the reader func
// to extract the response.
func (idx *index) Search(query []byte, opts SearchOptions,
responseReader readerFunc) error {
op := idx.client.Search
res, err := op(
op.WithContext(idx.ctx),
op.WithIndex(idx.name),
op.WithBody(bytes.NewBuffer(query)),
op.WithTrackTotalHits(true),
op.WithSize(opts.Size),
op.WithFrom(opts.From),
op.WithPretty(),
)
return idx.responseErrorOrNil(
fmt.Sprintf("could not complete search query %s", query),
res, err, responseReader)
}
// Delete an element from elasticsearch by Id.
func (idx *index) Delete(id string) error {
op := idx.client.Delete
res, err := op(
idx.name,
id,
op.WithContext(idx.ctx),
op.WithPretty(),
)
return idx.responseErrorOrNil(
fmt.Sprintf("could not delete id(%s) from index(%s)", id, idx.name),
res, err, ignoreResponseBody)
}
// Check whether a given document id is in the index
func (idx *index) Exists(id string) (bool, error) {
op := idx.client.Exists
res, err := op(
idx.name,
id,
op.WithContext(idx.ctx),
op.WithPretty(),
)
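if res != nil && !res.IsError() {
// responseErrorOrNil is not called on this path, so close the body here.
res.Body.Close()
return true, nil
} else if res != nil && res.StatusCode == 404 {
res.Body.Close()
return false, nil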
} else {
return false, idx.responseErrorOrNil(
fmt.Sprintf("could not check the existence of id(%s) from index(%s)", id, idx.name),
res, err, ignoreResponseBody)
}
}

View File

@@ -1,364 +0,0 @@
package index
import (
"context"
"encoding/json"
"fmt"
"io"
"io/ioutil"
"log"
"strings"
"time"
"sigs.k8s.io/kustomize/api/internal/crawl/doc"
)
const (
AggregationKeyword = "aggs"
)
type Mode int
const (
InsertOrUpdate Mode = iota
Delete
)
// Redefinition of Hits structure. Must match the json string of
// KustomizeResult.Hits.Hits. Declared as a convenience for iteration.
type KustomizeHits []struct {
ID string `json:"id"`
Document doc.KustomizationDocument `json:"result"`
}
type KustomizeResult struct {
ScrollID *string `json:"-"`
Hits *struct {
Total int `json:"total"`
Hits []struct {
ID string `json:"id"`
Document doc.KustomizationDocument `json:"result"`
} `json:"hits"`
} `json:"hits,omitempty"`
Aggregations *struct {
Timeseries *struct {
Buckets []struct {
Key string `json:"key"`
Count int `json:"count"`
} `json:"buckets"`
} `json:"timeseries,omitempty"`
Kinds *struct {
OtherCount int `json:"otherResults"`
Buckets []struct {
Key string `json:"key"`
Count int `json:"count"`
} `json:"buckets"`
} `json:"kinds,omitempty"`
} `json:"aggregations,omitempty"`
}
// Elasticsearch's field labels are sometimes inconsistent and often unwieldy,
// but the overall structure is reasonable, so it is reused here. This approach
// needs two copies of the type so that the JSON names can differ. The copies
// must have exactly the same structure, so the types are declared inline. Go
// checks at compile time that they are convertible, and the conversion is a
// no-op at runtime.
type ElasticKustomizeResult struct {
ScrollID *string `json:"_scroll_id,omitempty"`
Hits *struct {
Total int `json:"total"`
Hits []struct {
ID string `json:"_id"`
Document doc.KustomizationDocument `json:"_source"`
} `json:"hits"`
} `json:"hits,omitempty"`
Aggregations *struct {
Timeseries *struct {
Buckets []struct {
Key string `json:"key_as_string"`
Count int `json:"doc_count"`
}
} `json:"timeseries,omitempty"`
Kinds *struct {
OtherCount int `json:"sum_other_doc_count"`
Buckets []struct {
Key string `json:"key"`
Count int `json:"doc_count"`
}
} `json:"kinds,omitempty"`
} `json:"aggregations,omitempty"`
}
type KustomizeIndex struct {
*index
}
// Create index reference to the index containing the kustomize documents.
func NewKustomizeIndex(ctx context.Context, indexName string) (*KustomizeIndex, error) {
idx, err := newIndex(ctx, indexName)
if err != nil {
return nil, err
}
indicesExistsOp := idx.client.Indices.Exists
resp, err := indicesExistsOp([]string{indexName},
indicesExistsOp.WithContext(idx.ctx),
indicesExistsOp.WithPretty())
if err != nil {
return nil, err
}
// Release the connection once the status code has been inspected.
defer resp.Body.Close()
if resp.StatusCode == 200 {
log.Printf("The %s index already exists", indexName)
} else {
log.Printf("Creating the %s index\n", indexName)
if err := idx.CreateIndex([]byte(IndexConfig)); err != nil {
return nil, err
}
}
return &KustomizeIndex{idx}, nil
}
// Return a timeseries of kustomization file counts.
func TimeseriesAggregation() (string, map[string]interface{}) {
return "timeseries", map[string]interface{}{
"date_histogram": map[string]interface{}{
"field": "creationTime",
"interval": "day",
// XXX Only return buckets with counts; otherwise every
// day in the range is added to the output. This matters
// if a zero-valued time is ever stored in the
// creationTime field: the histogram would then return
// more than 600k entries (one for every day since year 0).
// It is unclear why returning empty buckets is the
// default, but it should be avoided here.
"min_doc_count": 1,
},
}
}
// Return aggregation of results based off of their kinds.
func KindAggregation(maxBuckets int) (string, map[string]interface{}) {
if maxBuckets < 1 {
maxBuckets = 1
}
return "kinds", map[string]interface{}{
"terms": map[string]interface{}{
"field": "kinds.keyword",
"size": maxBuckets,
},
}
}
// The multi_match search type in elasticsearch will check each field according
// to their respective analyzers for the identifier.
func multiMatch(query string) map[string]interface{} {
return map[string]interface{}{
"multi_match": map[string]interface{}{
"type": "cross_fields",
"fields": []string{
"values.keyword^3",
"identifiers.keyword^3",
"values.ngram",
"identifiers.ngram",
// TODO(damienr74) remove document with default
// analyzer. It does not handle special (=,: etc)
// characters properly, and matches with false
// positives. document.whitespace does not exist
// yet, but should use the whitespace analyzer.
"document",
"document.whitespace",
},
"query": query,
},
}
}
// Build an elasticsearch query from a user query.
func BuildQuery(query string) map[string]interface{} {
queryTokens := strings.Fields(query)
if len(queryTokens) == 0 {
return map[string]interface{}{
"size": 0,
}
}
mustMatch := make([]map[string]interface{}, len(queryTokens))
for i, tok := range queryTokens {
if strings.HasPrefix(strings.ToLower(tok), "kind=") {
mustMatch[i] = map[string]interface{}{
"term": map[string]interface{}{
"kinds.keyword": tok[5:],
},
}
continue
}
mustMatch[i] = multiMatch(tok)
}
structuredQuery := map[string]interface{}{
"query": map[string]interface{}{
"bool": map[string]interface{}{
"must": mustMatch,
},
},
}
return structuredQuery
}
// Iterator based off of the way bufio.Scanner works.
//
// Example:
// for it.Next() {
// for _, hit := range it.Value().Hits.Hits {
// // Handle hit.Document (a KustomizationDocument).
// }
// }
//
// if err := it.Err(); err != nil {
// // Handle err.
// }
type KustomizeIterator struct {
update scrollUpdater
err error
// Matches the return definition of elasticsearch search results. The
// scroll ID is practically a database cursor.
scrollImpl KustomizeResult
}
// Get the next batch of results. Note that this returns multiple results that
// can be iterated.
func (it *KustomizeIterator) Next() bool {
reader := func(reader io.Reader) error {
data, err := ioutil.ReadAll(reader)
if err != nil {
return fmt.Errorf("could not read from body: %v", err)
}
var scrollInput ElasticKustomizeResult
err = json.Unmarshal(data, &scrollInput)
if err != nil {
return fmt.Errorf("could not unmarshal %s into %T: %v",
data, scrollInput, err)
}
it.scrollImpl = KustomizeResult(scrollInput)
return nil
}
if it.err == nil {
log.Printf("updating scroll: %s\n", *it.scrollImpl.ScrollID)
it.err = it.update(*it.scrollImpl.ScrollID, reader)
}
// if there is no error and the array is not empty, then Value is
// obligated to return a valid result.
return it.err == nil &&
it.scrollImpl.Hits != nil &&
len(it.scrollImpl.Hits.Hits) > 0
}
// Get the value from this batch of iterations.
func (it *KustomizeIterator) Value() KustomizeResult {
return it.scrollImpl
}
// Check if any errors have occurred.
func (it *KustomizeIterator) Err() error {
return it.err
}
// Create an iterator over query. Iterate in chunks of batchSize, each batch
// should take no longer than timeout to read (otherwise, elasticsearch will
// delete the context).
//
// XXX It is important to set a reasonable amount of time to read the documents.
// If a lot of processing must be done, consider loading everything into memory
// first so that a short timeout period can be set. Scrolling creates a
// consistent DB context, so this can be costly.
//
// Scrolling is also not meant to be used for real time purposes. If you need
// results quickly, consider using the From: field in SearchOptions and a normal
// search. This will not guarantee that the values will not change but is more
// suitable for lower latencies/long execution timeouts.
func (ki *KustomizeIndex) IterateQuery(query []byte, batchSize int,
timeout time.Duration) *KustomizeIterator {
emptyScroll := ""
return &KustomizeIterator{
update: ki.scrollUpdater(query, batchSize, timeout),
scrollImpl: KustomizeResult{
ScrollID: &emptyScroll,
},
}
}
// Put is a type-specific wrapper for inserting structured kustomization documents.
func (ki *KustomizeIndex) Put(id string, doc *doc.KustomizationDocument) error {
return ki.index.Put(id, doc)
}
// Delete a document with a given id from the kustomize index.
func (ki *KustomizeIndex) Delete(id string) error {
return ki.index.Delete(id)
}
// Kustomize search options: which metrics should be returned (KindAggregation,
// TimeseriesAggregation, etc.). Also embeds SearchOptions to specify the
// position in the sorted list of results and the number of results to return.
type KustomizeSearchOptions struct {
SearchOptions
KindAggregation bool
TimeseriesAggregation bool
}
// Search the index with the given query string. Returns a structured result and possible
// aggregates.
func (ki *KustomizeIndex) Search(query string,
opts KustomizeSearchOptions) (*KustomizeResult, error) {
aggMap := make(map[string]interface{})
if opts.KindAggregation {
k, kAgg := KindAggregation(15)
aggMap[k] = kAgg
}
if opts.TimeseriesAggregation {
t, tAgg := TimeseriesAggregation()
aggMap[t] = tAgg
}
esQuery := BuildQuery(query)
if len(aggMap) > 0 {
esQuery[AggregationKeyword] = aggMap
}
data, err := json.Marshal(&esQuery)
if err != nil {
return nil, fmt.Errorf("failed to format query %s: %v", query, err)
}
log.Printf("formatted query: %s\n", data)
var kr ElasticKustomizeResult
err = ki.index.Search(data, opts.SearchOptions, func(results io.Reader) error {
data, err = ioutil.ReadAll(results)
if err != nil {
return fmt.Errorf("could not read results from search: %v", err)
}
if err = json.Unmarshal(data, &kr); err != nil {
return fmt.Errorf("could not parse results from search: %v", err)
}
return nil
})
res := KustomizeResult(kr)
return &res, err
}
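
The comment on `ElasticKustomizeResult` describes decoding with one set of JSON tags and re-encoding with another by converting between two structurally identical types. A minimal sketch of that idea follows, using two small hypothetical types (`elasticHit` and `publicHit`) in place of the full result structs. It relies on Go ignoring struct tags when deciding whether two struct types are convertible.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// elasticHit uses Elasticsearch-style field names for decoding.
type elasticHit struct {
	ID    string `json:"_id"`
	Count int    `json:"doc_count"`
}

// publicHit has the same fields and types, but friendlier JSON names.
type publicHit struct {
	ID    string `json:"id"`
	Count int    `json:"count"`
}

// relabel decodes Elasticsearch-style JSON and re-encodes it with the
// public field names. Both type names are hypothetical.
func relabel(in []byte) ([]byte, error) {
	var e elasticHit
	if err := json.Unmarshal(in, &e); err != nil {
		return nil, err
	}
	// The conversion compiles because only the tags differ.
	return json.Marshal(publicHit(e))
}

func main() {
	out, err := relabel([]byte(`{"_id":"abc","doc_count":3}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

If the two types ever drift apart in field names, order, or types, the conversion stops compiling, which is the compile-time check the original comment refers to.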

View File

@@ -1,72 +0,0 @@
package index
import (
"reflect"
"testing"
)
func TestBuildQuery(t *testing.T) {
testCases := []struct {
query string
result map[string]interface{}
}{
{
query: " \t\n\r",
result: map[string]interface{}{"size": 0},
},
{
query: "\tidentifier1 identifier2\nidentifier3\r",
result: map[string]interface{}{
"query": map[string]interface{}{
"bool": map[string]interface{}{
"must": []map[string]interface{}{
multiMatch("identifier1"),
multiMatch("identifier2"),
multiMatch("identifier3"),
},
},
},
},
},
{
query: "kind=Kustomization",
result: map[string]interface{}{
"query": map[string]interface{}{
"bool": map[string]interface{}{
"must": []map[string]interface{}{
{
"term": map[string]interface{}{
"kinds.keyword": "Kustomization",
},
},
},
},
},
},
},
{
query: "kind=Kustomization identifier2",
result: map[string]interface{}{
"query": map[string]interface{}{
"bool": map[string]interface{}{
"must": []map[string]interface{}{
{
"term": map[string]interface{}{
"kinds.keyword": "Kustomization",
},
},
multiMatch("identifier2"),
},
},
},
},
},
}
for _, tc := range testCases {
result := BuildQuery(tc.query)
if !reflect.DeepEqual(tc.result, result) {
t.Errorf("Expected %#v to match %#v", result, tc.result)
}
}
}
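
The test cases above pin down the observable behavior of `BuildQuery` without showing its body. The following is a minimal sketch that would produce the same shapes, written only from those cases: input is split on whitespace, a `kind=X` token becomes a `term` clause on `kinds.keyword`, every other token becomes a `multiMatch` clause, and an empty input yields `{"size": 0}`. The names `buildQuery` and the field list inside `multiMatch` are assumptions for illustration, not the deleted implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// multiMatch is a hypothetical stand-in for the helper of the same name used
// in the test; the field list below is an assumption, not taken from the source.
func multiMatch(term string) map[string]interface{} {
	return map[string]interface{}{
		"multi_match": map[string]interface{}{
			"query":  term,
			"fields": []string{"identifiers", "values", "document"},
		},
	}
}

// buildQuery splits the input on whitespace. A token of the form kind=X becomes
// a term clause on kinds.keyword; any other token becomes a multi_match clause.
// An input without tokens yields a body that requests zero hits.
func buildQuery(query string) map[string]interface{} {
	tokens := strings.Fields(query)
	if len(tokens) == 0 {
		return map[string]interface{}{"size": 0}
	}
	must := make([]map[string]interface{}, 0, len(tokens))
	for _, tok := range tokens {
		if strings.HasPrefix(tok, "kind=") {
			must = append(must, map[string]interface{}{
				"term": map[string]interface{}{
					"kinds.keyword": strings.TrimPrefix(tok, "kind="),
				},
			})
			continue
		}
		must = append(must, multiMatch(tok))
	}
	return map[string]interface{}{
		"query": map[string]interface{}{
			"bool": map[string]interface{}{"must": must},
		},
	}
}

func main() {
	body, err := json.Marshal(buildQuery("kind=Kustomization identifier2"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The sketch keeps clauses in token order, which agrees with the last test case; how the original ordered mixed queries in general is not visible from the test.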

Binary file not shown (53 KiB).

Binary file not shown (44 KiB).

Binary file not shown (32 KiB).

@@ -1,411 +0,0 @@
Find out the largest value of the `creationTime` field:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"max_creationTime" : { "max" : { "field" : "creationTime" } }
}
}
'
```
Find out the smallest value of the `creationTime` field:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"min_creationTime" : { "min" : { "field" : "creationTime" } }
}
}
'
```
Find out the smallest value of the `creationTime` field of all the kustomization files:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"min_creationTime" : { "min" : { "field" : "creationTime" } }
}
}
'
```
Find out the smallest value of the `creationTime` field of all kustomize resource files:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "resource" }}
]
}
},
"aggs" : {
"min_creationTime" : { "min" : { "field" : "creationTime" } }
}
}
'
```
Find out the smallest value of the `creationTime` field of all kustomize generator files:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "generator" }}
]
}
},
"aggs" : {
"min_creationTime" : { "min" : { "field" : "creationTime" } }
}
}
'
```
Find out the smallest value of the `creationTime` field of all kustomize transformer files:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "transformer" }}
]
}
},
"aggs" : {
"min_creationTime" : { "min" : { "field" : "creationTime" } }
}
}
'
```
Query all the documents whose `creationTime` <= `2016-07-29T17:38:26.000Z`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"range": {
"creationTime": {
"lte": "2016-07-29T17:38:26.000Z"
}
}
}
}
'
```
Query all the documents whose `creationTime` falls within a given range:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"range": {
"creationTime": {
"gte": "2016-07-29T17:38:26.000Z",
"lte": "2016-08-29T17:38:26.000Z"
}
}
}
}
'
```
Query all the kustomization files whose `creationTime` falls within a given range:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 20,
"query": {
"bool": {
"filter": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"must": {
"range": {
"creationTime": {
"gte": "2017-09-24T15:49:57.000Z",
"lte": "2017-09-24T15:49:57.000Z"
}
}
}
}
}
}
'
```
Aggregate how many new kustomization files were added to GitHub each month:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"newFiles_over_time" : {
"date_histogram" : {
"field" : "creationTime",
"interval" : "month"
}
}
}
}
'
```
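
A `date_histogram` aggregation comes back as a list of buckets under `aggregations.<name>.buckets`, each with `key`, `key_as_string`, and `doc_count`. The following sketch decodes that part of the response for the `newFiles_over_time` aggregation. The type and function names are hypothetical, and `sampleHistogram` is made-up data in the expected shape, not output from the real index.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// histogramBucket mirrors one entry of a date_histogram "buckets" array.
type histogramBucket struct {
	KeyAsString string `json:"key_as_string"`
	Key         int64  `json:"key"` // bucket start, milliseconds since the epoch
	DocCount    int64  `json:"doc_count"`
}

// histogramResponse keeps only the part of the search response that holds the
// newFiles_over_time aggregation.
type histogramResponse struct {
	Aggregations struct {
		NewFilesOverTime struct {
			Buckets []histogramBucket `json:"buckets"`
		} `json:"newFiles_over_time"`
	} `json:"aggregations"`
}

// parseHistogram extracts the buckets from a raw search response.
func parseHistogram(raw []byte) ([]histogramBucket, error) {
	var resp histogramResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, fmt.Errorf("could not parse histogram response: %v", err)
	}
	return resp.Aggregations.NewFilesOverTime.Buckets, nil
}

// sampleHistogram is illustrative data only.
const sampleHistogram = `{
  "aggregations": {
    "newFiles_over_time": {
      "buckets": [
        {"key_as_string": "2019-04-01T00:00:00.000Z", "key": 1554076800000, "doc_count": 3},
        {"key_as_string": "2019-05-01T00:00:00.000Z", "key": 1556668800000, "doc_count": 5}
      ]
    }
  }
}`

func main() {
	buckets, err := parseHistogram([]byte(sampleHistogram))
	if err != nil {
		panic(err)
	}
	for _, b := range buckets {
		fmt.Printf("%s\t%d\n", b.KeyAsString, b.DocCount)
	}
}
```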
Aggregate how many new kustomize resource files were added to GitHub each month:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
],
"filter": [
{ "regexp": { "fileType": "resource" }}
]
}
},
"aggs" : {
"newFiles_over_time" : {
"date_histogram" : {
"field" : "creationTime",
"interval" : "month"
}
}
}
}
'
```
Aggregate how many new kustomize generator files were added to GitHub each month:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
],
"filter": [
{ "regexp": { "fileType": "generator" }}
]
}
},
"aggs" : {
"newFiles_over_time" : {
"date_histogram" : {
"field" : "creationTime",
"interval" : "month"
}
}
}
}
'
```
Aggregate how many new kustomize transformer files were added to GitHub each month:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
],
"filter": [
{ "regexp": { "fileType": "transformer" }}
]
}
},
"aggs" : {
"newFiles_over_time" : {
"date_histogram" : {
"field" : "creationTime",
"interval" : "month"
}
}
}
}
'
```
Aggregate how many new kustomization files were added to GitHub each year:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"newFiles_over_time" : {
"date_histogram" : {
"field" : "creationTime",
"interval" : "year"
}
}
}
}
'
```
Aggregate how many new kustomize resource files were added to GitHub each year:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
],
"filter": [
{ "regexp": { "fileType": "resource" }}
]
}
},
"aggs" : {
"newFiles_over_time" : {
"date_histogram" : {
"field" : "creationTime",
"interval" : "year"
}
}
}
}
'
```
Aggregate how many new kustomize generator files were added to GitHub each year:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
],
"filter": [
{ "regexp": { "fileType": "generator" }}
]
}
},
"aggs" : {
"newFiles_over_time" : {
"date_histogram" : {
"field" : "creationTime",
"interval" : "year"
}
}
}
}
'
```
Aggregate how many new kustomize transformer files were added to GitHub each year:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
],
"filter": [
{ "regexp": { "fileType": "transformer" }}
]
}
},
"aggs" : {
"newFiles_over_time" : {
"date_histogram" : {
"field" : "creationTime",
"interval" : "year"
}
}
}
}
'
```
Find the generator files created within the given time range:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "generator" }}
],
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"must": {
"range": {
"creationTime": {
"gte": "2019-04-26T16:40:02.000Z",
"lte": "2019-04-26T16:40:02.000Z"
}
}
}
}
}
}
'
```
Find the transformer files created within the given time range:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "transformer" }}
],
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"must": {
"range": {
"creationTime": {
"gte": "2019-04-26T16:40:02.000Z",
"lte": "2019-04-26T16:40:02.000Z"
}
}
}
}
}
}
'
```

@@ -1,32 +0,0 @@
Count distinct values of the `defaultBranch` field:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"defaultBranch_count" : {
"cardinality" : {
"field" : "defaultBranch",
"precision_threshold": 40000
}
}
}
}
'
```
List all the GitHub branches where kustomization files and kustomize resource files live,
and how many of those files live in each branch:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"defaultBranch" : {
"terms" : {
"field" : "defaultBranch",
"size": 41
}
}
}
}
'
```

@@ -1,55 +0,0 @@
Count the documents whose `document` field is empty (the `document` field is
empty when the underlying file itself is empty):
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 10000,
"query": {
"bool": {
"must_not": {
"exists": {
"field": "document"
}
}
}
}
}
'
```
Find all the documents having the `creationTime` field set:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"exists": {
"field": "creationTime"
}
}
}
'
```
Find all the documents whose `creationTime` field is not set:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 10000,
"query": {
"bool": {
"must_not": {
"exists": {
"field": "creationTime"
}
}
}
}
}
'
```
The following fields of a document in the kustomize index are always non-empty:
`repositoryUrl`, `filePath`, `defaultBranch`.
The following fields of a document in the kustomize index may be empty:
`kinds`, `identifiers`, `values`.

@@ -1,301 +0,0 @@
Find all the documents having the `fileType` field set:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"exists": {
"field": "fileType"
}
}
}
'
```
Find all the documents whose `fileType` field is not set:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 10000,
"query": {
"bool": {
"must_not": {
"exists": {
"field": "fileType"
}
}
}
}
}
'
```
Search for all the documents whose `fileType` field is `resource`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "resource" }}
]
}
}
}
'
```
Search for all the kustomization files whose `fileType` field is `resource`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }},
{ "regexp": { "fileType": "resource" }}
]
}
}
}
'
```
Search for all the kustomize resource files:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "resource" }}
],
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
}
}
}
}
'
```
Search all the kustomization files including a `generators` field:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 10000,
"query": {
"bool": {
"must": {
"match" : {
"identifiers" : {
"query" : "generators"
}
}
},
"filter": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
}
}
}
}
'
```
Search for all the documents whose `fileType` field is `generator`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "generator" }}
]
}
}
}
'
```
Search for all the kustomization files whose `fileType` field is `generator`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }},
{ "regexp": { "fileType": "generator" }}
]
}
}
}
'
```
Search for all the kustomize generator files:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "generator" }}
],
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
}
}
}
}
'
```
Search all the kustomization files including a `transformers` field:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 10000,
"query": {
"bool": {
"must": {
"match" : {
"identifiers" : {
"query" : "transformers"
}
}
},
"filter": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
}
}
}
}
'
```
Search for all the documents whose `fileType` field is `transformer`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "transformer" }}
]
}
}
}
'
```
Search for all the kustomization files whose `fileType` field is `transformer`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }},
{ "regexp": { "fileType": "transformer" }}
]
}
}
}
'
```
Search for all the kustomize transformer files:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "transformer" }}
],
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
}
}
}
}
'
```
Count distinct values of the `fileType` field:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"fileType_count" : {
"cardinality" : {
"field" : "fileType",
"precision_threshold": 40000
}
}
}
}
'
```
List all the values of the `fileType` field and the frequency of each value:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"fileType" : {
"terms" : {
"field" : "fileType"
}
}
}
}
'
```
For all the kustomization files in the index, list all the values of the
`fileType` field and the frequency of each value:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"fileType" : {
"terms" : {
"field" : "fileType"
}
}
}
}
'
```
For all the non-kustomization files in the index, list all the values of the
`fileType` field and the frequency of each value:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
}
}
},
"aggs" : {
"fileType" : {
"terms" : {
"field" : "fileType"
}
}
}
}
'
```

@@ -1,29 +0,0 @@
Find all the generator files whose `kinds` field includes `ChartRenderer`, and
only output certain fields of each document:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 200,
"_source": {
"includes": ["kinds", "repositoryUrl", "defaultBranch", "filePath"]
},
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "generator" }}
],
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"must": {
"match" : {
"kinds" : {
"query" : "ChartRenderer"
}
}
}
}
}
}
'
```

@@ -1,12 +0,0 @@
Find the document with the given `_id`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"terms": {
"_id": [ "b3a03f3327841617db696e2d6abc30e1a1bd653f1a2bbce05637f7dcae1a43f7" ]
}
}
}
'
```

@@ -1,82 +0,0 @@
Count the documents in the index whose `repositoryUrl` field starts with
`https://github.com/`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "repositoryUrl": "https://github.com/.*" }}
]
}
}
}
'
```
Count the documents in the index whose `repositoryUrl` field does not start with
`https://github.com/`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": [
{ "regexp": { "repositoryUrl": "https://github.com/.*" }}
]
}
}
}
'
```
Search all the documents matching the given `repositoryUrl` and `filePath`, and return
a version for each search hit:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 10000,
"version": true,
"query": {
"bool": {
"filter": [
{ "regexp": { "repositoryUrl": "git@github.com:talos-systems/talos-controller-manager" }},
{ "regexp": { "filePath": "hack/config.*" }}
]
}
}
}
'
```
Search for all the documents whose `filePath` ends with one of the following three filenames:
`kustomization.yaml`, `kustomization.yml`, `kustomization`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
}
}
'
```
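
To see which paths the `filePath` pattern accepts, it can be tried out locally. The sketch below assumes `filePath` is indexed as a single keyword term, so the Elasticsearch `regexp` query has to match the whole value; Go's `regexp` package searches for substrings by default, hence the explicit `^` and `$` anchors. `isKustomizationFile` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// kustomizationPath is the pattern from the queries above, anchored so that it
// must match the entire path, as an Elasticsearch regexp query would.
var kustomizationPath = regexp.MustCompile(
	`^(?:(.*/)?kustomization((.yaml)?|(.yml)?)(/)*)$`)

// isKustomizationFile reports whether path would be selected by the filter.
func isKustomizationFile(path string) bool {
	return kustomizationPath.MatchString(path)
}

func main() {
	paths := []string{
		"kustomization.yaml",
		"overlays/prod/kustomization.yml",
		"kustomization",
		"base/deployment.yaml",
		"base/kustomization.yaml.bak",
		"kustomizationXyaml", // the dots in the pattern are unescaped
	}
	for _, p := range paths {
		fmt.Printf("%-35s %v\n", p, isKustomizationFile(p))
	}
}
```

Because the dots in `.yaml` and `.yml` are not escaped, the pattern also accepts names such as `kustomizationXyaml`; escaping them as `\\.` inside the JSON string would tighten the match if that ever matters.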
Search for all the documents whose `filePath` does not end with any of the following
three filenames: `kustomization.yaml`, `kustomization.yml`, `kustomization`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
}
}
'
```

@@ -1,32 +0,0 @@
Check the health status of an ElasticSearch cluster:
```
curl -s -X GET "${ElasticSearchURL}:9200/_cat/health?v&pretty"
```
Check the indices in an ElasticSearch cluster:
```
curl -s "${ElasticSearchURL}:9200/_cat/indices?v"
```
Get the mapping of the index:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_mapping?pretty"
```
Delete the kustomize index from the ElasticSearch cluster (**Use this command with caution**):
```
curl -s -X DELETE "${ElasticSearchURL}:9200/${INDEXNAME}?pretty"
```
Add a new field to an existing index:
```
curl -s -X PUT "${ElasticSearchURL}:9200/${INDEXNAME}/_mapping/_doc?pretty" -H 'Content-Type: application/json' -d'
{
"properties": {
"fileType": {
"type": "keyword"
}
}
}
'
```

@@ -1,255 +0,0 @@
Count distinct values of the `repositoryUrl` field:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"repositoryUrl_count" : {
"cardinality" : {
"field" : "repositoryUrl",
"precision_threshold": 40000
}
}
}
}
'
```
Count how many GitHub repositories include kustomization files:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"repositoryUrl_count" : {
"cardinality" : {
"field" : "repositoryUrl",
"precision_threshold": 40000
}
}
}
}
'
```
Count distinct values of the `repositoryUrl` field for all the kustomize resource files in the index:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "resource" }}
]
}
},
"aggs" : {
"repositoryUrl_count" : {
"cardinality" : {
"field" : "repositoryUrl",
"precision_threshold": 40000
}
}
}
}
'
```
Count distinct values of the `repositoryUrl` field for all the kustomize generator files in the index:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "generator" }}
]
}
},
"aggs" : {
"repositoryUrl_count" : {
"cardinality" : {
"field" : "repositoryUrl",
"precision_threshold": 40000
}
}
}
}
'
```
Count distinct values of the `repositoryUrl` field for all the kustomize transformer files in the index:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "transformer" }}
]
}
},
"aggs" : {
"repositoryUrl_count" : {
"cardinality" : {
"field" : "repositoryUrl",
"precision_threshold": 40000
}
}
}
}
'
```
Count distinct values of the `repositoryUrl` field for all the kustomize resource dirs in the index:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "resource" }},
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"repositoryUrl_count" : {
"cardinality" : {
"field" : "repositoryUrl",
"precision_threshold": 40000
}
}
}
}
'
```
Count distinct values of the `repositoryUrl` field for all the kustomize generator dirs in the index:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "generator" }},
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"repositoryUrl_count" : {
"cardinality" : {
"field" : "repositoryUrl",
"precision_threshold": 40000
}
}
}
}
'
```
Count distinct values of the `repositoryUrl` field for all the kustomize transformer dirs in the index:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "transformer" }},
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"repositoryUrl_count" : {
"cardinality" : {
"field" : "repositoryUrl",
"precision_threshold": 40000
}
}
}
}
'
```
List all the GitHub repositories that include kustomization files and kustomize resource files,
and how many of those files each repository includes
(the repository with the most kustomization files is listed first):
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"repositoryUrl" : {
"terms" : {
"field" : "repositoryUrl",
"size": 2082
}
}
}
}
'
```
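
A `terms` aggregation returns its buckets sorted by descending `doc_count` by default, each with a `key` and a `doc_count`. As a rough sketch, the response can be decoded like this; the type and function names are hypothetical, and `sampleTerms` is made-up data with placeholder repository URLs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// termsBucket mirrors one entry of a terms aggregation "buckets" array.
type termsBucket struct {
	Key      string `json:"key"`
	DocCount int64  `json:"doc_count"`
}

// termsResponse keeps only the aggregations part of a search response.
type termsResponse struct {
	Aggregations map[string]struct {
		Buckets []termsBucket `json:"buckets"`
	} `json:"aggregations"`
}

// parseTerms returns the buckets of the aggregation called aggName.
func parseTerms(raw []byte, aggName string) ([]termsBucket, error) {
	var resp termsResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, fmt.Errorf("could not parse terms response: %v", err)
	}
	agg, ok := resp.Aggregations[aggName]
	if !ok {
		return nil, fmt.Errorf("aggregation %q not found in response", aggName)
	}
	return agg.Buckets, nil
}

// sampleTerms is illustrative data only.
const sampleTerms = `{
  "aggregations": {
    "repositoryUrl": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {"key": "https://github.com/example/repo-a", "doc_count": 12},
        {"key": "https://github.com/example/repo-b", "doc_count": 7}
      ]
    }
  }
}`

func main() {
	buckets, err := parseTerms([]byte(sampleTerms), "repositoryUrl")
	if err != nil {
		panic(err)
	}
	for _, b := range buckets {
		fmt.Printf("%6d  %s\n", b.DocCount, b.Key)
	}
}
```

The same decoder works for the `defaultBranch`, `fileType`, and `user` listings, since only the aggregation name changes.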
List the 20 GitHub repositories with the most kustomization files:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"repositoryUrl" : {
"terms" : {
"field" : "repositoryUrl",
"size": 20
}
}
}
}
'
```
List the 20 GitHub repositories with the most kustomize resource files:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "resource" }}
]
}
},
"aggs" : {
"repositoryUrl" : {
"terms" : {
"field" : "repositoryUrl",
"size": 20
}
}
}
}
'
```

@@ -1,29 +0,0 @@
Retrieve information about all registered snapshot repositories:
```
curl -s -X GET "${ElasticSearchURL}:9200/_snapshot?pretty"
```
Retrieve information about a given snapshot repository, `kustomize-backup`:
```
curl -s -X GET "${ElasticSearchURL}:9200/_snapshot/kustomize-backup?pretty"
```
Verify a snapshot repository, `kustomize-backup`, manually:
```
curl -s -X POST "${ElasticSearchURL}:9200/_snapshot/kustomize-backup/_verify?pretty"
```
List all the snapshots in a given snapshot repository:
```
curl -s -X GET "${ElasticSearchURL}:9200/_cat/snapshots/kustomize-backup?v&s=id&pretty"
```
Retrieve summary information about a given snapshot:
```
curl -s -X GET "${ElasticSearchURL}:9200/_snapshot/kustomize-backup/kustomize-snapshot?pretty"
```
Retrieve detailed information about a given snapshot:
```
curl -s -X GET "${ElasticSearchURL}:9200/_snapshot/kustomize-backup/kustomize-snapshot/_status?pretty"
```

File diff suppressed because it is too large.

@@ -1,148 +0,0 @@
Search for all the kustomize resource files including a Deployment object:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match" : {
"kinds" : {
"query" : "Deployment"
}
}
}
}
'
```
Search for all the kustomize resource files including a Deployment object, but only
including the `kinds` field in the result:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"_source": {
"includes": ["kinds"]
},
"query": {
"match" : {
"kinds" : {
"query" : "Deployment"
}
}
}
}
'
```
Search for all the kustomize resource files including both a Deployment object and
a Service object:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match" : {
"kinds" : {
"query" : "Deployment Service",
"operator" : "and"
}
}
}
}
'
```
Count the number of documents including Deployment and the number of documents
including Service:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs" : {
"messages" : {
"filters" : {
"filters" : {
"Deployment" : { "match" : { "kinds" : "Deployment" }},
"Service" : { "match" : { "kinds" : "Service" }}
}
}
}
}
}
'
```
Search for all the kustomization files involving CRDs:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 10000,
"query": {
"match" : {
"identifiers" : {
"query" : "crds"
}
}
}
}
'
```
Search for all the kustomization files defining configMapGenerator:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 10000,
"query": {
"match" : {
"identifiers" : {
"query" : "configMapGenerator"
}
}
}
}
'
```
Search for all the documents having a `kind` field:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "match" : { "identifiers" : { "query" : "kind" }}}
]
}
}
}
'
```
Search for all the kustomization files having a `kind` field:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": ".*/kustomization((.yaml)?|(.yml)?)" }},
{ "match" : { "identifiers" : { "query" : "kind" }}}
]
}
}
}
'
```
Search for all the kustomization files defining the `generatorOptions:disableNameSuffixHash` feature:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match" : {
"identifiers" : {
"query" : "generatorOptions:disableNameSuffixHash"
}
}
}
}
'
```

@@ -1,29 +0,0 @@
Find all the transformer files whose `kinds` field includes `HelmValues`, and
only output certain fields of each document:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 200,
"_source": {
"includes": ["kinds", "repositoryUrl", "defaultBranch", "filePath"]
},
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "transformer" }}
],
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"must": {
"match" : {
"kinds" : {
"query" : "HelmValues"
}
}
}
}
}
}
'
```

@@ -1,380 +0,0 @@
Find all the documents having the `user` field set:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"exists": {
"field": "user"
}
}
}
'
```
Find all the documents whose `user` field is not set:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 10000,
"query": {
"bool": {
"must_not": {
"exists": {
"field": "user"
}
}
}
}
}
'
```
Search for all the documents whose `user` field is `kubernetes-sigs`:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "user": "kubernetes-sigs" }}
]
}
}
}
'
```
Count distinct values of the `user` field:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"user_count" : {
"cardinality" : {
"field" : "user",
"precision_threshold": 40000
}
}
}
}
'
```
List all the values of the `user` field and the frequency of each value:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"user" : {
"terms" : {
"field" : "user",
"size" : 20
}
}
}
}
'
```
Count distinct values of the `user` field for all the kustomization files in the index:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"user_count" : {
"cardinality" : {
"field" : "user",
"precision_threshold": 40000
}
}
}
}
'
```
For all the kustomization files in the index, list all the values of the
`user` field and the frequency of each value:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"user" : {
"terms" : {
"field" : "user",
"size": 20
}
}
}
}
'
```
Count distinct values of the `user` field for all the kustomize resource files in the index:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "resource" }}
]
}
},
"aggs" : {
"user_count" : {
"cardinality" : {
"field" : "user",
"precision_threshold": 40000
}
}
}
}
'
```
For all the kustomize resource files in the index, list all the values of the
`user` field and the frequency of each value:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "resource" }}
]
}
},
"aggs" : {
"user" : {
"terms" : {
"field" : "user",
"size": 20
}
}
}
}
'
```
Count distinct values of the `user` field for all the kustomize generator files in the index:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "generator" }}
]
}
},
"aggs" : {
"user_count" : {
"cardinality" : {
"field" : "user",
"precision_threshold": 40000
}
}
}
}
'
```
For all the kustomize generator files in the index, list all the values of the
`user` field and the frequency of each value:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "generator" }}
]
}
},
"aggs" : {
"user" : {
"terms" : {
"field" : "user",
"size": 20
}
}
}
}
'
```
Count distinct values of the `user` field for all the kustomize transformer files in the index:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "transformer" }}
]
}
},
"aggs" : {
"user_count" : {
"cardinality" : {
"field" : "user",
"precision_threshold": 40000
}
}
}
}
'
```
For all the kustomize transformer files in the index, list all the values of the
`user` field and the frequency of each value:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": {
"regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
},
"filter": [
{ "regexp": { "fileType": "transformer" }}
]
}
},
"aggs" : {
"user" : {
"terms" : {
"field" : "user",
"size": 20
}
}
}
}
'
```
Count distinct values of the `user` field for all the kustomize generator dirs in the index:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "generator" }},
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"user_count" : {
"cardinality" : {
"field" : "user",
"precision_threshold": 40000
}
}
}
}
'
```
For all the kustomize generator dirs in the index, list all the values of the
`user` field and the frequency of each value:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "generator" }},
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"user" : {
"terms" : {
"field" : "user",
"size": 20
}
}
}
}
'
```
Count distinct values of the `user` field for all the kustomize transformer dirs in the index:
```
curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "transformer" }},
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"user_count" : {
"cardinality" : {
"field" : "user",
"precision_threshold": 40000
}
}
}
}
'
```
For all the kustomize transformer dirs in the index, list all the values of the
`user` field and the frequency of each value:
```
curl -s -X GET "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": [
{ "regexp": { "fileType": "transformer" }},
{ "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }}
]
}
},
"aggs" : {
"user" : {
"terms" : {
"field" : "user"
}
}
}
}
'
```
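The per-type count queries above differ only in the `fileType` value. As a minimal sketch, the request body could be generated by a small shell function. The name `user_count_body` is hypothetical and not part of this repository. It assumes the same `user`, `fileType` and `filePath` fields used in the queries above.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the repository).
# Prints the JSON body that counts distinct "user" values among
# non-kustomization files of the given fileType.
user_count_body() {
  file_type="$1"   # e.g. resource, generator, transformer
  cat <<EOF
{
  "query": {
    "bool": {
      "must_not": {
        "regexp": { "filePath": "(.*/)?kustomization((.yaml)?|(.yml)?)(/)*" }
      },
      "filter": [
        { "regexp": { "fileType": "${file_type}" }}
      ]
    }
  },
  "aggs": {
    "user_count": {
      "cardinality": { "field": "user", "precision_threshold": 40000 }
    }
  }
}
EOF
}

# Example usage (requires a reachable Elasticsearch instance):
#   curl -s -X POST "${ElasticSearchURL}:9200/${INDEXNAME}/_search?size=0&pretty" \
#     -H 'Content-Type: application/json' -d "$(user_count_body generator)"
```

Generating the body this way keeps the three `fileType` variants in sync. The "dirs" variants above would need a second function that moves the `filePath` regexp into the `filter` list.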

View File

@@ -1,2 +0,0 @@
node_modules
dist

View File

@@ -1,2 +0,0 @@
node_modules
dist

View File

@@ -1,2 +0,0 @@
node_modules
dist

View File

@@ -1,16 +0,0 @@
FROM node:latest as builder
WORKDIR /app
COPY package.json package-lock.json /app/
RUN cd /app && npm set progress=false && npm install
COPY . /app
RUN cd /app && npm run build
FROM nginx:alpine
RUN rm -rf /usr/share/nginx/html/*
# todo(damienr74), put this in configmap.
COPY nginx.conf /etc/nginx/nginx.conf
COPY --from=builder /app/dist/kustomize-search/ /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]

View File

@@ -1,25 +0,0 @@
There is a Dockerfile for building container images.
## Development server
Run `ng serve` for a dev server. Navigate to `http://localhost:4200/`. The app will automatically reload if you change any of the source files.
## Code scaffolding
Run `ng generate component component-name` to generate a new component. You can also use `ng generate directive|pipe|service|class|guard|interface|enum|module`.
## Build
Run `ng build` to build the project. The build artifacts will be stored in the `dist/` directory. Use the `--prod` flag for a production build.
## Running unit tests
Run `ng test` to execute the unit tests via [Karma](https://karma-runner.github.io).
## Running end-to-end tests
Run `ng e2e` to execute the end-to-end tests via [Protractor](http://www.protractortest.org/).
## Further help
To get more help on the Angular CLI use `ng help` or go check out the [Angular CLI README](https://github.com/angular/angular-cli/blob/master/README.md).

View File

@@ -1,123 +0,0 @@
{
"$schema": "./node_modules/@angular/cli/lib/config/schema.json",
"version": 1,
"newProjectRoot": "projects",
"projects": {
"kustomize-search": {
"projectType": "application",
"schematics": {},
"root": "",
"sourceRoot": "src",
"prefix": "app",
"architect": {
"build": {
"builder": "@angular-devkit/build-angular:browser",
"options": {
"outputPath": "dist/kustomize-search",
"index": "src/index.html",
"main": "src/main.ts",
"polyfills": "src/polyfills.ts",
"tsConfig": "tsconfig.app.json",
"aot": false,
"assets": [
"src/favicon.ico",
"src/assets"
],
"styles": [
"./node_modules/@angular/material/prebuilt-themes/deeppurple-amber.css",
"src/styles.css"
],
"scripts": []
},
"configurations": {
"production": {
"fileReplacements": [
{
"replace": "src/environments/environment.ts",
"with": "src/environments/environment.prod.ts"
}
],
"optimization": true,
"outputHashing": "all",
"sourceMap": false,
"extractCss": true,
"namedChunks": false,
"aot": true,
"extractLicenses": true,
"vendorChunk": false,
"buildOptimizer": true,
"budgets": [
{
"type": "initial",
"maximumWarning": "2mb",
"maximumError": "5mb"
}
]
}
}
},
"serve": {
"builder": "@angular-devkit/build-angular:dev-server",
"options": {
"browserTarget": "kustomize-search:build"
},
"configurations": {
"production": {
"browserTarget": "kustomize-search:build:production"
}
}
},
"extract-i18n": {
"builder": "@angular-devkit/build-angular:extract-i18n",
"options": {
"browserTarget": "kustomize-search:build"
}
},
"test": {
"builder": "@angular-devkit/build-angular:karma",
"options": {
"main": "src/test.ts",
"polyfills": "src/polyfills.ts",
"tsConfig": "tsconfig.spec.json",
"karmaConfig": "karma.conf.js",
"assets": [
"src/favicon.ico",
"src/assets"
],
"styles": [
"./node_modules/@angular/material/prebuilt-themes/deeppurple-amber.css",
"src/styles.css"
],
"scripts": []
}
},
"lint": {
"builder": "@angular-devkit/build-angular:tslint",
"options": {
"tsConfig": [
"tsconfig.app.json",
"tsconfig.spec.json",
"e2e/tsconfig.json"
],
"exclude": [
"**/node_modules/**"
]
}
},
"e2e": {
"builder": "@angular-devkit/build-angular:protractor",
"options": {
"protractorConfig": "e2e/protractor.conf.js",
"devServerTarget": "kustomize-search:serve"
},
"configurations": {
"production": {
"devServerTarget": "kustomize-search:serve:production"
}
}
}
}
}
},
"defaultProject": "kustomize-search"
}

View File

@@ -1,12 +0,0 @@
# This file is used by the build system to adjust CSS and JS output to support the specified browsers below.
# For additional information regarding the format and rule options, please see:
# https://github.com/browserslist/browserslist#queries
# You can see what browsers were selected by your queries by running:
# npx browserslist
> 0.5%
last 2 versions
Firefox ESR
not dead
not IE 9-11 # For IE 9-11 support, remove 'not'.

View File

@@ -1,5 +0,0 @@
steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/kustomize-search/frontend', '.']
images:
- 'gcr.io/kustomize-search/frontend'

View File

@@ -1,32 +0,0 @@
// @ts-check
// Protractor configuration file, see link for more information
// https://github.com/angular/protractor/blob/master/lib/config.ts
const { SpecReporter } = require('jasmine-spec-reporter');
/**
* @type { import("protractor").Config }
*/
exports.config = {
allScriptsTimeout: 11000,
specs: [
'./src/**/*.e2e-spec.ts'
],
capabilities: {
'browserName': 'chrome'
},
directConnect: true,
baseUrl: 'http://localhost:4200/',
framework: 'jasmine',
jasmineNodeOpts: {
showColors: true,
defaultTimeoutInterval: 30000,
print: function() {}
},
onPrepare() {
require('ts-node').register({
project: require('path').join(__dirname, './tsconfig.json')
});
jasmine.getEnv().addReporter(new SpecReporter({ spec: { displayStacktrace: true } }));
}
};

View File

@@ -1,23 +0,0 @@
import { AppPage } from './app.po';
import { browser, logging } from 'protractor';
describe('workspace-project App', () => {
  let page: AppPage;
  beforeEach(() => {
    page = new AppPage();
  });
  it('should display the app title', () => {
    page.navigateTo();
    // AppComponent sets title = 'k8s Search', which the root template renders in its <h1>.
    expect(page.getTitleText()).toEqual('k8s Search');
  });
  afterEach(async () => {
    // Assert that there are no errors emitted from the browser
    const logs = await browser.manage().logs().get(logging.Type.BROWSER);
    expect(logs).not.toContain(jasmine.objectContaining({
      level: logging.Level.SEVERE,
    } as logging.Entry));
  });
});

View File

@@ -1,11 +0,0 @@
import { browser, by, element } from 'protractor';
export class AppPage {
navigateTo() {
return browser.get(browser.baseUrl) as Promise<any>;
}
getTitleText() {
return element(by.css('app-root h1')).getText() as Promise<string>;
}
}

View File

@@ -1,13 +0,0 @@
{
"extends": "../tsconfig.json",
"compilerOptions": {
"outDir": "../out-tsc/e2e",
"module": "commonjs",
"target": "es5",
"types": [
"jasmine",
"jasminewd2",
"node"
]
}
}

View File

@@ -1,32 +0,0 @@
// Karma configuration file, see link for more information
// https://karma-runner.github.io/1.0/config/configuration-file.html
module.exports = function (config) {
config.set({
basePath: '',
frameworks: ['jasmine', '@angular-devkit/build-angular'],
plugins: [
require('karma-jasmine'),
require('karma-chrome-launcher'),
require('karma-jasmine-html-reporter'),
require('karma-coverage-istanbul-reporter'),
require('@angular-devkit/build-angular/plugins/karma')
],
client: {
clearContext: false // leave Jasmine Spec Runner output visible in browser
},
coverageIstanbulReporter: {
dir: require('path').join(__dirname, './coverage/kustomize-search'),
reports: ['html', 'lcovonly', 'text-summary'],
fixWebpackSourcePaths: true
},
reporters: ['progress', 'kjhtml'],
port: 9876,
colors: true,
logLevel: config.LOG_INFO,
autoWatch: true,
browsers: ['Chrome'],
singleRun: false,
restartOnFileChange: true
});
};

View File

@@ -1,25 +0,0 @@
worker_processes 1;
events {
worker_connections 1024;
}
http {
server {
listen 80;
server_name 0.0.0.0;
root /usr/share/nginx/html;
index index.html index.htm;
include /etc/nginx/mime.types;
gzip on;
gzip_min_length 1000;
gzip_proxied expired no-cache no-store private auth;
gzip_types text/plain text/css application/json application/javascript application/x-javascript text/javascript;
location / {
try_files $uri $uri/ /index.html;
}
}
}

File diff suppressed because it is too large

View File

@@ -1,55 +0,0 @@
{
"name": "kustomize-search",
"version": "0.0.0",
"scripts": {
"ng": "ng",
"start": "ng serve",
"build": "ng build --prod --aot",
"test": "ng test",
"lint": "ng lint",
"e2e": "ng e2e"
},
"private": true,
"dependencies": {
"@angular/animations": "^8.2.14",
"@angular/cdk": "~8.2.3",
"@angular/common": "^8.2.14",
"@angular/compiler": "^8.2.14",
"@angular/core": "^8.2.14",
"@angular/forms": "^8.2.14",
"@angular/http": "^7.2.15",
"@angular/material": "^8.2.3",
"@angular/platform-browser": "^8.2.14",
"@angular/platform-browser-dynamic": "^8.2.14",
"@angular/router": "^8.2.14",
"angular-google-charts": "^0.1.6",
"chart.js": "^2.9.3",
"core-js": "^3.5.0",
"hammerjs": "^2.0.8",
"rxjs": "~6.5.3",
"serialize-javascript": "^2.1.2",
"tslib": "^1.10.0",
"zone.js": "~0.10.2"
},
"devDependencies": {
"@angular-devkit/build-angular": "^0.803.20",
"@angular/cli": "^8.3.20",
"@angular/compiler-cli": "^8.2.14",
"@angular/language-service": "^8.2.14",
"@types/jasmine": "^3.5.0",
"@types/jasminewd2": "^2.0.8",
"@types/node": "~12.12.17",
"codelyzer": "^5.2.0",
"jasmine-core": "~3.5.0",
"jasmine-spec-reporter": "~4.2.1",
"karma": "~4.4.1",
"karma-chrome-launcher": "~3.1.0",
"karma-coverage-istanbul-reporter": "^2.1.1",
"karma-jasmine": "~2.0.1",
"karma-jasmine-html-reporter": "^1.4.2",
"protractor": "~5.4.2",
"ts-node": "~8.5.4",
"tslint": "~5.20.1",
"typescript": "~3.7.3"
}
}

View File

@@ -1,2 +0,0 @@
<h1>{{ title }}</h1>
<router-outlet></router-outlet>

View File

@@ -1,31 +0,0 @@
import { TestBed, async } from '@angular/core/testing';
import { RouterTestingModule } from '@angular/router/testing';
import { AppComponent } from './app.component';
describe('AppComponent', () => {
  beforeEach(async(() => {
    TestBed.configureTestingModule({
      // The root template contains <router-outlet>, so the router must be available.
      imports: [
        RouterTestingModule
      ],
      declarations: [
        AppComponent
      ],
    }).compileComponents();
  }));
  it('should create the app', () => {
    const fixture = TestBed.createComponent(AppComponent);
    const app = fixture.debugElement.componentInstance;
    expect(app).toBeTruthy();
  });
  it(`should have as title 'k8s Search'`, () => {
    const fixture = TestBed.createComponent(AppComponent);
    const app = fixture.debugElement.componentInstance;
    // Matches the title set in app.component.ts.
    expect(app.title).toEqual('k8s Search');
  });
  it('should render title in a h1 tag', () => {
    const fixture = TestBed.createComponent(AppComponent);
    fixture.detectChanges();
    const compiled = fixture.debugElement.nativeElement;
    expect(compiled.querySelector('h1').textContent).toContain('k8s Search');
  });
});

View File

@@ -1,10 +0,0 @@
import { Component } from '@angular/core';
@Component({
selector: 'app-root',
templateUrl: './app.component.html',
styleUrls: ['./app.component.css']
})
export class AppComponent {
title = 'k8s Search';
}

View File

@@ -1,58 +0,0 @@
import { BrowserModule } from '@angular/platform-browser';
import { Routes, RouterModule } from '@angular/router';
import { NgModule } from '@angular/core';
import { FormsModule } from '@angular/forms';
import { HttpClientModule } from '@angular/common/http';
import { MatExpansionModule } from '@angular/material/expansion';
import { MatInputModule } from '@angular/material/input';
import { MatListModule } from '@angular/material/list';
import { MatButtonModule } from '@angular/material/button';
import { AppComponent } from './app.component';
import { SearchComponent } from './search/search.component';
import { BrowserAnimationsModule } from '@angular/platform-browser/animations';
import { HistogramComponent } from './histogram/histogram.component';
import { TimeseriesComponent } from './timeseries/timeseries.component';
const appRoutes: Routes = [
{
path: 'search',
component: SearchComponent,
runGuardsAndResolvers: 'always'
},
// Always redirect to the search endpoint for now.
{
path: '',
redirectTo: 'search',
pathMatch: 'full',
},
];
@NgModule({
declarations: [
AppComponent,
SearchComponent,
HistogramComponent,
TimeseriesComponent,
],
imports: [
BrowserModule,
BrowserAnimationsModule,
HttpClientModule,
MatExpansionModule,
MatInputModule,
MatListModule,
MatButtonModule,
FormsModule,
RouterModule.forRoot(
appRoutes,
{ onSameUrlNavigation: 'reload', }
)
],
providers: [
{provide: HttpClientModule}
],
bootstrap: [AppComponent]
})
export class AppModule {}

View File

@@ -1,41 +0,0 @@
export interface SearchResults {
hits: SearchResults.Hits;
aggregations?: SearchResults.Aggregations;
};
export namespace SearchResults {
export class Hits {
total: number;
hits: SearchResults.InnerHits[];
};
export class InnerHits {
id: string;
result: SearchResults.Result;
};
export class Result {
repositoryUrl: string;
filePath: string;
defaultBranch: string;
document: string;
creationTime: Date;
values: string;
kinds: string;
};
export interface Aggregations {
timeseries?: SearchResults.BucketAggregation;
kinds?: SearchResults.BucketAggregation;
};
export interface BucketAggregation {
otherResults?: number;
buckets: SearchResults.Bucket[];
};
export class Bucket {
key: string;
count: number;
};
};

View File

@@ -1 +0,0 @@
<div><canvas id="histogram">{{hist}}</canvas></div>

View File

@@ -1,25 +0,0 @@
import { async, ComponentFixture, TestBed } from '@angular/core/testing';
import { HistogramComponent } from './histogram.component';
describe('HistogramComponent', () => {
let component: HistogramComponent;
let fixture: ComponentFixture<HistogramComponent>;
beforeEach(async(() => {
TestBed.configureTestingModule({
declarations: [ HistogramComponent ]
})
.compileComponents();
}));
beforeEach(() => {
fixture = TestBed.createComponent(HistogramComponent);
component = fixture.componentInstance;
fixture.detectChanges();
});
it('should create', () => {
expect(component).toBeTruthy();
});
});

Some files were not shown because too many files have changed in this diff