zalando-incubator / catwatch
A metrics dashboard for GitHub organizations, with results accessible via REST API
Home Page: https://zalando.github.io/
License: Other
On the front/home page, the name of the TechMonkeys project "Gin-OAuth2" is showing up with additional running text in its respective card. Other projects are listed by name-only.
#66 introduced the ability to disable the regular in-process fetcher in order to run multiple replicas of catwatch. This was done via scoping it into a profile so that the user could decide whether to run it via setting or omitting this profile.
See https://github.bus.zalan.do/stups/stups-deploy/pull/285/files#diff-bfcd49cb9b1efcabc1006c3528eb334e for a possible configuration.
However, looking at the logs of the Kubernetes deployment, I can see that with the above configuration the in-process task fetcher is still running.
$ kubectl logs -f catwatch-master-11-14-232796100-8zpw1
08:01:00.001 [pool-3-thread-1] INFO o.z.c.backend.scheduler.Fetcher - Starting fetching data. Snapshot date: Mon Jun 26 08:01:00 UTC 2017 1498464060000, IP and MAC Address: 10.2.62.159#0A-58-0A-02-3E-9F.
08:01:00.001 [pool-3-thread-1] INFO o.z.c.backend.scheduler.Fetcher - Enqueued task TakeSnapshotTask for organization 'zalando'.
08:01:00.001 [pool-2-thread-7] INFO o.z.c.b.github.TakeSnapshotTask - Taking snapshot of organization 'zalando'.
08:01:00.001 [pool-3-thread-1] INFO o.z.c.backend.scheduler.Fetcher - Enqueued task TakeSnapshotTask for organization 'zalando-stups'.
08:01:00.002 [pool-3-thread-1] INFO o.z.c.backend.scheduler.Fetcher - Enqueued task TakeSnapshotTask for organization 'zalando-incubator'.
08:01:00.002 [pool-2-thread-8] INFO o.z.c.b.github.TakeSnapshotTask - Taking snapshot of organization 'zalando-stups'.
08:01:00.002 [pool-3-thread-1] INFO o.z.c.backend.scheduler.Fetcher - Submitted 3 TakeSnapshotTasks.
08:01:00.002 [pool-2-thread-9] INFO o.z.c.b.github.TakeSnapshotTask - Taking snapshot of organization 'zalando-incubator'.
08:01:01.688 [pool-2-thread-8] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting statistics for organization 'zalando-stups'.
08:01:01.905 [pool-2-thread-7] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting statistics for organization 'zalando'.
08:01:03.309 [pool-2-thread-9] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting statistics for organization 'zalando-incubator'.
08:01:03.774 [pool-2-thread-9] WARN o.z.c.b.github.OrganizationWrapper - No teams found for organization 'zalando-incubator'.
08:01:18.903 [pool-2-thread-8] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting statistics for organization 'zalando-stups'.
08:01:18.903 [pool-2-thread-8] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting projects for organization 'zalando-stups'.
08:01:28.288 [pool-2-thread-7] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting statistics for organization 'zalando'.
08:01:28.288 [pool-2-thread-7] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting projects for organization 'zalando'.
08:02:02.044 [pool-2-thread-9] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting statistics for organization 'zalando-incubator'.
08:02:02.044 [pool-2-thread-9] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting projects for organization 'zalando-incubator'.
08:02:46.878 [pool-2-thread-8] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting projects for organization 'zalando-stups'.
08:02:46.878 [pool-2-thread-8] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting contributors for organization 'zalando-stups'.
09:01:06.548 [pool-2-thread-8] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting contributors for organization 'zalando-stups'.
09:01:06.549 [pool-2-thread-8] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting languages for organization 'zalando-stups'.
09:01:12.755 [pool-2-thread-8] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting languages for organization 'zalando-stups'.
09:01:12.755 [pool-2-thread-8] INFO o.z.c.b.github.TakeSnapshotTask - Successfully taken snapshot of organization 'zalando-stups'.
09:02:16.736 [pool-2-thread-7] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting projects for organization 'zalando'.
09:02:16.736 [pool-2-thread-7] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting contributors for organization 'zalando'.
09:03:11.951 [pool-2-thread-7] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting contributors for organization 'zalando'.
09:03:11.951 [pool-2-thread-7] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting languages for organization 'zalando'.
09:03:21.399 [pool-2-thread-7] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting languages for organization 'zalando'.
09:03:21.399 [pool-2-thread-7] INFO o.z.c.b.github.TakeSnapshotTask - Successfully taken snapshot of organization 'zalando'.
09:03:23.011 [pool-3-thread-1] INFO o.z.c.backend.scheduler.Fetcher - Successfully saved data for organization 'zalando'.
09:03:23.612 [pool-3-thread-1] INFO o.z.c.backend.scheduler.Fetcher - Successfully saved data for organization 'zalando-stups'.
$ kubectl get pods catwatch-master-11-14-232796100-8zpw1 -o json | jq '.spec.containers[].env[] | select(.name == "SPRING_PROFILES_ACTIVE")'
{
"name": "SPRING_PROFILES_ACTIVE",
"value": "postgresql,k8s"
}
/cc @jbellmann
We should apply score penalties to projects with fewer than two maintainers. This involves changing the "score" function (https://github.com/zalando/catwatch/blob/master/catwatch-backend/src/main/resources/application.properties#L22).
Proposal for the score function:
function(project) {
  var penalty = 0;
  if (project.maintainers.length < 2) {
    penalty = 100;
  }
  return project.forksCount > 0 ? (project.starsCount + project.forksCount + project.contributorsCount + project.commitsCount / 100 - penalty) : 0;
}
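To illustrate the effect of the proposed penalty, here is a small sketch that evaluates the function on two made-up projects. The field names come from the proposal above; the function name `score`, the sample projects, and all of their numbers are invented for illustration only.

```javascript
// Score function as proposed above, given a name so it can be called directly.
function score(project) {
  var penalty = 0;
  if (project.maintainers.length < 2) {
    penalty = 100;
  }
  return project.forksCount > 0
    ? project.starsCount + project.forksCount + project.contributorsCount +
      project.commitsCount / 100 - penalty
    : 0;
}

// Hypothetical sample data (not taken from any real repository).
var wellMaintained = {
  maintainers: ["alice", "bob"],
  starsCount: 50, forksCount: 10, contributorsCount: 5, commitsCount: 300
};
var singleMaintainer = {
  maintainers: ["alice"],
  starsCount: 50, forksCount: 10, contributorsCount: 5, commitsCount: 300
};

console.log(score(wellMaintained));   // 68
console.log(score(singleMaintainer)); // -32
```

Note that with a flat penalty of 100, a small project with a single maintainer can end up with a negative score, which ranks it below the 0 assigned to projects without forks.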
There are several unsecured endpoints in the AdminController; one of them accepts JavaScript.
It would be awesome to get the maintainers (via REST API) of all repositories. Example MAINTAINERS file: https://github.com/zalando-stups/fullstop/blob/master/MAINTAINERS
This information could be displayed but also used to check for non-maintained projects and to do custom scripting ("give me all projects I'm maintaining").
Hi there,
If you search the Golang projects, you get BP x 2. Can you take a look?
Interesting metrics would be:
Not sure if this is possible with the github APIs though.
It is possibly due to Hibernate schema discovery:
17:06:05.214 [main] INFO org.hibernate.Version - HHH000412: Hibernate Core {4.3.10.Final}
17:06:05.219 [main] INFO org.hibernate.cfg.Environment - HHH000206: hibernate.properties not found
17:06:05.221 [main] INFO org.hibernate.cfg.Environment - HHH000021: Bytecode provider name : javassist
17:06:05.658 [main] INFO o.h.annotations.common.Version - HCANN000001: Hibernate Commons Annotations {4.0.5.Final}
17:06:31.766 [main] INFO org.hibernate.dialect.Dialect - HHH000400: Using dialect: org.hibernate.dialect.PostgreSQLDialect
"opensource" is the wrong team id, it should be "opensourceguild"
http://projects.spring.io/spring-boot/
Current stable version is 1.3.2
Proposal: Catwatch should automatically read a .catwatch.yaml file in the root of the repository if it's there. This YAML file allows defining a human-readable title and a project image (logo):
Example:
title: ZMON Controller
image: https://demo.zmon.io/logo.png
Special symbols like slashes cause exceptions when parsing .catwatch.yaml:
com.fasterxml.jackson.dataformat.yaml.snakeyaml.reader.ReaderException: special characters are not allowed
The Catwatch API (https://catwatch.opensource.zalan.do/projects) returns image: null for every project, even for those that define an image in their .catwatch.yaml file, e.g. connexion.
When calling the /contributors endpoint, the returned contributors always have a null organization property, even though it is non-null in the database.
When I go to http://zalando.github.io/zalando.github.io-dev/ in a browser on a mobile phone, all numbers are zero and no repositories are shown.
Solution: Provide https://catwatch-web.hackweek.zalan.do with a proper TLS certificate (not a self-signed one).
The graphs (http://zalando.github.io/#graphs) currently break the layout and do not look great --- maybe we can hide them until they are "polished"?
18:33:25.092 [pool-3-thread-1] INFO o.z.c.b.github.TakeSnapshotTask - Taking snapshot of organization 'zalando-stups'.
18:33:25.093 [http-nio-8080-exec-1] INFO o.z.c.backend.scheduler.Fetcher - Submitted 1 TakeSnapshotTasks.
18:33:27.189 [pool-3-thread-1] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting statistics for organization 'zalando-stups'.
18:33:35.223 [pool-3-thread-1] WARN o.z.c.b.github.RepositoryWrapper - No contributors found for project 'stups-feedback' of organization 'zalando-stups'.
18:33:36.829 [pool-3-thread-1] WARN o.z.c.b.github.RepositoryWrapper - No contributors found for project 'costreport' of organization 'zalando-stups'.
18:33:46.474 [pool-3-thread-1] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting statistics for organization 'zalando-stups'.
18:33:46.475 [pool-3-thread-1] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting projects for organization 'zalando-stups'.
18:35:01.580 [pool-3-thread-1] WARN o.z.c.b.github.RepositoryWrapper - No commits found for project 'stups-feedback' of organization 'zalando-stups'.
18:35:01.833 [pool-3-thread-1] WARN o.z.c.b.github.RepositoryWrapper - No contributors found for project 'stups-feedback' of organization 'zalando-stups'.
18:35:12.587 [pool-3-thread-1] WARN o.z.c.b.github.RepositoryWrapper - No commits found for project 'costreport' of organization 'zalando-stups'.
18:35:12.721 [pool-3-thread-1] WARN o.z.c.b.github.RepositoryWrapper - No contributors found for project 'costreport' of organization 'zalando-stups'.
18:35:12.881 [pool-3-thread-1] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting projects for organization 'zalando-stups'.
18:35:12.881 [pool-3-thread-1] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting contributors for organization 'zalando-stups'.
18:35:16.024 [pool-3-thread-1] WARN o.z.c.b.github.RepositoryWrapper - No contributors found for project 'stups-feedback' of organization 'zalando-stups'.
18:35:16.160 [pool-3-thread-1] WARN o.z.c.b.github.RepositoryWrapper - No contributors found for project 'costreport' of organization 'zalando-stups'.
18:35:25.062 [pool-3-thread-1] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting contributors for organization 'zalando-stups'.
18:35:25.062 [pool-3-thread-1] INFO o.z.c.b.github.TakeSnapshotTask - Started collecting languages for organization 'zalando-stups'.
18:35:27.504 [pool-3-thread-1] INFO o.z.c.b.github.TakeSnapshotTask - Finished collecting languages for organization 'zalando-stups'.
18:35:27.504 [pool-3-thread-1] INFO o.z.c.b.github.TakeSnapshotTask - Successfully taken snapshot of organization 'zalando-stups'.
18:35:27.853 [http-nio-8080-exec-1] INFO o.z.c.backend.scheduler.Fetcher - Successfully saved data for organization 'zalando-stups'.
18:35:27.853 [http-nio-8080-exec-1] INFO o.z.c.backend.scheduler.Fetcher - Finished fetching data.
I'm currently in the process of migrating https://zalando.github.io/ to our Kubernetes setup.
I intend to run at least two replicas (containers) of Catwatch in order to ensure it's available during cluster updates. However, as far as I can see, that would also schedule two concurrent fetcher tasks [1].
My question is: do I have to fear any undefined behaviour, corrupt data, etc. from running multiple fetchers, or is it safe?
The current code base does not fetch all repositories (GHOrganization.listRepositories() returns only the first page).
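The general fix for this kind of problem is to keep requesting pages until one comes back empty. The following is only a language-neutral sketch of that loop, not the actual backend code: `listAllRepositories`, `fetchPage`, and the fake data source are hypothetical names introduced here, and the real fix would go through whatever paging support the GitHub client library provides.

```javascript
// Collects items from every page instead of stopping after the first one.
// fetchPage(pageNumber) is a hypothetical callback that returns an array of
// repositories for that page, and an empty array once there are no more pages.
function listAllRepositories(fetchPage) {
  var all = [];
  for (var page = 1; ; page++) {
    var items = fetchPage(page);
    if (items.length === 0) {
      break;
    }
    all = all.concat(items);
  }
  return all;
}

// Fake data source standing in for the GitHub API: 5 repositories, 2 per page.
var fakeRepos = ["a", "b", "c", "d", "e"];
function fakeFetchPage(page) {
  var pageSize = 2;
  return fakeRepos.slice((page - 1) * pageSize, page * pageSize);
}

console.log(listAllRepositories(fakeFetchPage).length); // 5
```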
When I go to http://zalando.github.io/zalando.github.io-dev/ in a browser on a mobile phone, all repositories are listed twice.
The Catwatch API should follow the Zalando REST API guidelines, including:
Currently the "/projects" endpoint (to name one) returns camelCase JSON properties.
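Assuming the guideline in question is the one asking for snake_case property names, the renaming could look roughly like the sketch below. The helper names `toSnakeCase` and `snakeCaseKeys` and the sample project object are made up for illustration; in the backend itself this would more likely be handled by the JSON serializer's naming configuration.

```javascript
// Converts a single camelCase key to snake_case, e.g. "forksCount" -> "forks_count".
function toSnakeCase(key) {
  return key.replace(/[A-Z]/g, function (ch) {
    return "_" + ch.toLowerCase();
  });
}

// Recursively renames the keys of plain objects and arrays of objects.
function snakeCaseKeys(value) {
  if (Array.isArray(value)) {
    return value.map(snakeCaseKeys);
  }
  if (value !== null && typeof value === "object") {
    var result = {};
    Object.keys(value).forEach(function (key) {
      result[toSnakeCase(key)] = snakeCaseKeys(value[key]);
    });
    return result;
  }
  return value;
}

// Hypothetical response fragment with invented values.
var project = { name: "catwatch", starsCount: 1, forksCount: 2 };
console.log(JSON.stringify(snakeCaseKeys(project)));
// {"name":"catwatch","stars_count":1,"forks_count":2}
```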
During the scheduled job, some exceptions occur:
It seems to fetch data for 'zalando-techmonkeys', as configured in application.properties.
Does this organization exist? I'm unable to find it.
Hey, can we get rid of those @rbobin? :)
I personally vote for Maven, as it is a more standard and widespread system.
Forked repositories are currently included in the list of "Zalando's Open Source Projects" (https://zalando.github.io/#repositories), e.g. the "docker-maven-plugin" (https://github.com/zalando-stups/docker-maven-plugin) is not a Zalando-owned project.
We should exclude all forked repositories from CatWatch as they skew the view on what are Zalando's Open Source projects.
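As a rough sketch of the filtering step: repository objects returned by GitHub's REST API carry a boolean `fork` field, so forks can be dropped before any further processing. The function name `excludeForks` and the sample list are assumptions for illustration; the `fork` values shown are example data, not the result of an API call.

```javascript
// Keeps only repositories that are not forks. The boolean "fork" field mirrors
// the one in GitHub's REST API repository objects.
function excludeForks(repositories) {
  return repositories.filter(function (repo) {
    return !repo.fork;
  });
}

// Illustrative sample data.
var repos = [
  { name: "catwatch", fork: false },
  { name: "docker-maven-plugin", fork: true }
];
console.log(excludeForks(repos).map(function (r) { return r.name; }));
// [ 'catwatch' ]
```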
Does the Catwatch formula used to calculate rankings factor in the number of commits? I would advise dropping this from the formula because it doesn't necessarily indicate better project quality. Projects like Zappr that are really useful aren't showing up near the top of the rankings -- would prefer that they did.
If the project gets its data from GitHub, why not cache it in memory or use an in-memory DB instead of depending on PostgreSQL? AFAIK, without a DB this project would have a higher chance of adoption.
Zalando Research focuses on publishing open-source machine learning libraries.
Organization url: https://github.com/zalandoresearch/
Could I request some screenshots or a live demo? The project sounds interesting and I want to learn more before I have the time to run through the setup. :)
Benefit: the Dockerfile does not need to be adjusted to pass configuration parameters, i.e. something like the following can be avoided.
CMD java -jar /catwatch-backend.jar -Dspring.database.driverClassName=${SPRING_DATASOURCE_DRIVERCLASSNAME} -Dspring.jpa.hibernate.ddl-auto=${SPRING_JPA_HIBERNATE_DDL_AUTO}
It will result in an error, but I don't know exactly which column, as I changed some of them to text.
This is probably why projects from the Zalando repo are not displayed.
When I go to http://zalando.github.io/zalando.github.io-dev/, an AWS instance of the temporary Hackweek account is used as the backend.
https://catwatch-web.hackweek.zalan.do
But the backend should run on an AWS instance of a permanent account, and it should be monitored and updated regularly.