personal-notes

Golang embed

The //go:embed directive embeds a file's contents into the variable at build time. The blank import of "embed" is required when embedding into a plain string or []byte.

import (
	_ "embed" // required for //go:embed on plain string/[]byte variables
)

// esRemoteServiceYAML holds the contents of templates/es.remoteService.yaml,
// embedded into the binary at build time.
//go:embed templates/es.remoteService.yaml
var esRemoteServiceYAML string

Thoughts on BDD

BDD (Behaviour Driven Development)

The most characteristic BDD keywords are Given, When, and Then.

Python has the behave library. A typical example:

from behave import *

@given('I am on home page')
def step_i_am_on_home_page(context):
    context.driver.get("http://demo.magentocommerce.com/")

@when('I search for {text}')
def step_i_search_for(context, text):
    search_field = context.driver.find_element_by_name("q")
    search_field.clear()

    # enter search keyword and submit
    search_field.send_keys(text)
    search_field.submit()

@then('I should see list of matching products in search results')
def step_i_should_see_list(context):
    products = context.driver.\
        find_elements_by_xpath("//h2[@class='product-name']/a")
    # check count of products shown in results
    assert len(products) > 0

Under the hood, step matching is implemented with regular expressions: each step line is matched against the registered patterns.
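
A rough sketch of the matching idea (an illustration only, not behave's actual implementation):

import re

# Step registry: (compiled pattern, step function) pairs.
registry = []

def when(pattern):
    """Turn '{name}' placeholders into named regex groups and register the step."""
    regex = re.compile(re.sub(r"\{(\w+)\}", r"(?P<\1>.+)", pattern))
    def register(fn):
        registry.append((regex, fn))
        return fn
    return register

@when('I search for {text}')
def step_i_search_for(context, text):
    print("searching for:", text)

def run_step(context, line):
    # Try each registered pattern; captured groups become step arguments.
    for regex, fn in registry:
        match = regex.fullmatch(line)
        if match:
            return fn(context, **match.groupdict())
    raise LookupError("no step matches: " + line)

run_step(None, "I search for shoes")  # prints: searching for: shoes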

This regex-based dispatch makes debugging the step code quite painful.

My take is that it depends on who writes the BDD test cases:

  • If the PM writes them, regex matching makes sense, but the step keywords and phrasing must be agreed upon up front.
  • If developers write them themselves, the order is usually reversed: the code comes first and the descriptive text is added afterwards, in which case plain print-style output is enough.

GCP Operations Suite on GKE

source

SLO example: a request-based SLO using a distribution cut, with a 99% goal over a calendar month. The min of -9007199254740991 (the smallest safe integer) makes the range effectively unbounded below, so the SLI counts requests with latency up to 100 ms as good.

{
  "displayName": "99% - Distribution Cut - Calendar month",
  "goal": 0.99,
  "calendarPeriod": "MONTH",
  "serviceLevelIndicator": {
    "requestBased": {
      "distributionCut": {
        "distributionFilter": "metric.type=\"custom.googleapis.com/opencensus/grpc.io/client/roundtrip_latency\" resource.type=\"global\"",
        "range": {
          "min": -9007199254740991,
          "max": 100
        }
      }
    }
  }
}

GCP Workload Identity

Main use case: access GCP services without exposing a service account key.

It works by binding an IAM service account to a Kubernetes service account.
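
A minimal sketch of the binding (project, namespace, and account names below are placeholders):

# Allow the Kubernetes service account to impersonate the IAM service account
gcloud iam service-accounts add-iam-policy-binding \
    gsa-name@project-id.iam.gserviceaccount.com \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:project-id.svc.id.goog[my-namespace/my-ksa]"

# Annotate the Kubernetes service account so GKE knows which IAM account it maps to
kubectl annotate serviceaccount my-ksa \
    --namespace my-namespace \
    iam.gke.io/gcp-service-account=gsa-name@project-id.iam.gserviceaccount.com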

Kubernetes Components


Control Plane Components

  • kube-apiserver: exposes the Kubernetes API
  • etcd: stores all cluster data
  • kube-scheduler: watches for newly created Pods with no assigned node and picks a node for them
  • kube-controller-manager: runs the built-in controllers
  • cloud-controller-manager: links the cluster to the cloud provider's API

Node Components:

  • kubelet: agent on each node; makes sure containers are running in a Pod
  • kube-proxy: network proxy
  • container runtime


Helm

The most basic function of a Helm chart is templating.

More advanced features include managing chart dependencies and tests.
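
A minimal illustration of the templating idea (the chart layout and names are hypothetical):

# values.yaml
replicaCount: 2

# templates/deployment.yaml (excerpt): placeholders are filled in at render time
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-app
spec:
  replicas: {{ .Values.replicaCount }}

Rendering with helm template my-release ./chart produces a Deployment with replicas: 2.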

GCP IAM role map to Kubernetes RBAC

Authorize actions in clusters using role-based access control
Kubernetes Roles support four kinds of binding subjects (https://cloud.google.com/kubernetes-engine/docs/how-to/role-based-access-control#rolebinding):

  • GCP User email address
  • Kubernetes ServiceAccount
  • IAM service account
  • Google Group address

The minimum required permission is container.clusters.get.

In almost all cases, Kubernetes RBAC can be used instead of IAM. GKE users require at minimum, the container.clusters.get IAM permission in the project that contains the cluster. This permission is included in the container.clusterViewer role, and in other more highly privileged roles. The container.clusters.get permission is required for users to authenticate to the clusters in the project, but does not authorize them to perform any actions inside those clusters. Authorization may then be provided by either IAM or Kubernetes RBAC.

Example:

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-reader-binding
  namespace: accounting
subjects:
# Google Cloud user account
- kind: User
  name: [email protected]
# Kubernetes service account
- kind: ServiceAccount
  name: johndoe
# IAM service account
- kind: User
  name: [email protected]
# Google Group
- kind: Group
  name: [email protected]
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

The main open question is how to bind a Role to an IAM role/group.

Kubernetes ServiceAccount

In Kubernetes, a ServiceAccount is mainly used to grant permissions to Pods.

A ServiceAccount can be bound to a Role via a RoleBinding, as sketched below.
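
A minimal sketch (names are hypothetical; pod-reader is assumed to be an existing Role):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-sa-pod-reader
  namespace: default
subjects:
- kind: ServiceAccount
  name: app-sa
  namespace: default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

A Pod then uses it via spec.serviceAccountName: app-sa.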

Kubernetes Ingress


An Ingress exposes Services to external traffic. It:

  • acts as an entry point
  • defines routing rules

Example:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /foo
        pathType: Prefix
        backend:
          service:
            name: foo-service
            port:
              number: 3000
      - path: /bar
        pathType: Prefix
        backend:
          service:
            name: bar-service
            port:
              number: 6000
  - host: foo.example.com
    http:
      paths:
      - pathType: Prefix
        path: "/foo"
        backend:
          service:
            name: foo-service-2
            port:
              number: 80
  - host: "*.foo.example.com"
    http:
      paths:
      - pathType: Prefix
        path: "/foo"
        backend:
          service:
            name: foo-service-3
            port:
              number: 8080

Kubernetes pod affinity, taint & toleration

  • affinity: attract a pod to specific nodes
  • anti-affinity: keep a pod away from specific nodes
  • taint & toleration: used together to keep pods off, or let them onto, specific nodes. Taints go on nodes, tolerations on pods. Imagine rooms as nodes, people as pods, the locks on the rooms as taints, and the keys to those locks as tolerations (see the sketch below).
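
A minimal sketch (node, key, and value names are hypothetical):

# Taint a node: only pods tolerating dedicated=gpu:NoSchedule may land on it
kubectl taint nodes node1 dedicated=gpu:NoSchedule

# A pod carrying the matching "key"
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  tolerations:
  - key: dedicated
    operator: Equal
    value: gpu
    effect: NoSchedule
  containers:
  - name: app
    image: nginx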

Kubernetes Service Type

  • ClusterIP: for internal access
  • NodePort: exposes the service on a port of every node's IP (see the sketch below)
  • LoadBalancer: provisioned by the cloud provider
  • ExternalName: maps an internal service name to an external DNS name; mainly used as an alias for an external service
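
A minimal NodePort sketch (names and ports are hypothetical):

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
  - port: 80          # cluster-internal port
    targetPort: 8080  # container port
    nodePort: 30080   # opened on every node (default range 30000-32767)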

Terraform

Current course:
KodeKloud: Terraform Basics Training Course

Understanding resource, data source, and variable:

  • resource: creates and manages infrastructure
  • data source: reads dynamic or already-existing resources
  • variable: holds static input values; if no default is provided, terraform apply prompts for one
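
A minimal sketch contrasting the three, using the google provider (all names and values are hypothetical):

variable "instance_name" {
  type    = string
  default = "demo-instance" # without a default, terraform apply prompts for a value
}

# data source: look up an existing image managed outside this configuration
data "google_compute_image" "debian" {
  family  = "debian-12"
  project = "debian-cloud"
}

# resource: created and managed by Terraform
resource "google_compute_instance" "vm" {
  name         = var.instance_name
  machine_type = "e2-micro"
  zone         = "us-central1-a"

  boot_disk {
    initialize_params {
      image = data.google_compute_image.debian.self_link
    }
  }

  network_interface {
    network = "default"
  }
}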

useful providers:

SonarQube

Run SonarQube

from docker

docker run -d --name sonarqube -e SONAR_ES_BOOTSTRAP_CHECKS_DISABLE=true -p 9000:9000 sonarqube:latest

SonarScanner

Download Page

When running from Docker, the scanner container must be able to reach the SonarQube server's network:

docker run \
    --rm \
    -e SONAR_HOST_URL="http://${SONARQUBE_URL}" \
    -e SONAR_SCANNER_OPTS="-Dsonar.projectKey=${YOUR_PROJECT_KEY}" \
    -e SONAR_LOGIN="myAuthenticationToken" \
    -v "${YOUR_REPO}:/usr/src" \
    sonarsource/sonar-scanner-cli

Heredoc Strings

Pass multiple lines of text.

<<DELIMITER
hello
world
DELIMITER

The delimiter can be any string; most people use EOF.
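
For example, in a shell script (a minimal sketch):

cat <<EOF
hello
world
EOF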

Thoughts on NAT traversal

Requirements

  • We send notifications to users via webhook
  • The user's server is isolated inside a VPC
  • We need a tunnel so that our webhook requests can reach the user's server

The solution at the time

We built a WebSocket server: one end sat on our server, and a WebSocket client had to be deployed inside the VPC.
The two ends were connected over WebSocket, which turned the original one-way request into a rather complex bidirectional, interactive system.

  • First, the WebSocket client inside the VPC has to actively connect to our WebSocket server. (Authentication is needed here as well; secrets must be configured when the client is deployed.)
  • Once connected, the client waits for messages from the server. The server maintains a client registry mapping client IDs to connection IDs.
  • When the server needs to send a request, it looks up the corresponding connection and sends it over that.

Problems with this design

  • Hard to deploy (and update): when a bug requires updating the client-side code, the process is extremely painful. The servers are not under our team's control, there is no CI/CD to use for updates, and every change has to pass CAB-style approvals. I still haven't fully mastered that deployment process (extremely traditional ops).
  • Complex code logic:
    • What used to be solved with a single request became a multi-step process, and every step adds another chance to fail.
    • A single request turned into maintaining a long-lived connection. Go's WebSocket support is not particularly stable, and the existing packages are all quite low-level, so we had to implement connection timeouts and reconnection ourselves. The resulting logic is enormously complex and still not guaranteed to be correct.
  • The complex logic translates directly into high maintenance cost.
  • Resource utilization: to guarantee the server can always reach the client, the connection must be kept open, and the client must reconnect immediately if it drops. In practice the fraction of time actually spent sending requests is tiny; idle time can exceed 90-95%.
  • Scaling: because the server has to hold all of these long-lived connections, it comes under heavy load.

Afterthoughts

The root-cause fix would be a firewall rule that lets our server's IP through the VPC.

Alternatively, a pub/sub design would work: the client subscribes to SQS and sends the webhook request as soon as it receives a message.

Other options, such as gRPC streaming, are also feasible.

Retrospective

This was a bad project that I was involved in from start to finish. There was no time for proper research before it kicked off, and by the time the architectural problems became clear, the leader had already demoed it to upper management and detailed implementation was underway. By the time anyone wanted to stop it, it was already on the rails and had to ship.
Some problems were simply not anticipated at the start, such as the frequent connection drops, which showed up as soon as we tested.

The core principle of NAT traversal: the inside actively connects to the outside; once the connection is established, the outside can send messages to the inside.

Also: a pub/sub model avoids NAT traversal altogether.

Tracing

Opentelemetry

Key concepts:

  • Tracing
  • Metrics
  • Span
  • Baggage: the context only carries the trace ID and span ID; any extra information must travel as Baggage. Span tags are also only valid within the current span and are not inherited by child spans, so anything that needs to propagate should go into Baggage (see the sketch below).
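
A minimal Baggage sketch with the Go SDK (the member name user.id is illustrative):

package main

import (
	"context"
	"fmt"

	"go.opentelemetry.io/otel/baggage"
)

func main() {
	ctx := context.Background()

	// Attach a key/value that should travel with the context across spans
	// (and, with a propagator configured, across process boundaries).
	bag, err := baggage.Parse("user.id=42")
	if err != nil {
		panic(err)
	}
	ctx = baggage.ContextWithBaggage(ctx, bag)

	// Read it back anywhere downstream.
	fmt.Println(baggage.FromContext(ctx).Member("user.id").Value()) // 42
}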

Official docs

OpenTelemetry quick start: mostly sample code

uptrace tutorial

OpenTelemetry-Go Contrib

The trace ID and span ID are carried in the context; passing the context along is what keeps spans connected.

Setting the global tracer

Be sure to call SetTextMapPropagator; otherwise traces cannot cross process boundaries.

import (
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/jaeger"
	"go.opentelemetry.io/otel/propagation"
	"go.opentelemetry.io/otel/sdk/resource"
	tracesdk "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.4.0"
)

func SetGlobalTracer(serviceName string, exporterAddress string, exporterPort string) error {
	exporter, err := jaeger.New(jaeger.WithAgentEndpoint(
		jaeger.WithAgentHost(exporterAddress),
		jaeger.WithAgentPort(exporterPort),
	))

	if err != nil {
		return err
	}

	tp := tracesdk.NewTracerProvider(
		tracesdk.WithBatcher(exporter),
		tracesdk.WithResource(resource.NewWithAttributes(
			semconv.SchemaURL,
			semconv.ServiceNameKey.String(serviceName),
		)),
	)

	otel.SetTracerProvider(tp)
	otel.SetTextMapPropagator(propagation.NewCompositeTextMapPropagator(propagation.TraceContext{}))
	return nil
}
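
A hypothetical call site, assuming a Jaeger agent listening on localhost:6831 (and "context" and "log" imported alongside the packages above):

if err := SetGlobalTracer("my-service", "localhost", "6831"); err != nil {
	log.Fatal(err)
}

tr := otel.Tracer("main")
ctx, span := tr.Start(context.Background(), "startup")
defer span.End()
_ = ctx // pass ctx into downstream calls so their spans nest under "startup"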

Span Attributes

Add attributes to a span; in Jaeger they show up as tags.

tr := otel.Tracer("component-http")
ctx, span := tr.Start(ctx, "http", trace.WithAttributes(attribute.Key("attr-http").String("hello http")))
span.SetAttributes(attribute.Key("http.method").String("GET"))
span.SetAttributes(attribute.Key("http.url").String(url))

Span Event

span.AddEvent("Init")
span.AddEvent("End")

Nested spans

func parentFunction(ctx context.Context) {
    tracer := otel.Tracer("component-parent")
    ctx, span := tracer.Start(ctx, "parent")
    defer span.End()
    // call the child function with the span-carrying context,
    // so the child span nests under "parent"
    childFunction(ctx)
}

func childFunction(ctx context.Context) {
    tracer := otel.Tracer("component-child")
    _, span := tracer.Start(ctx, "child")
    defer span.End()
    // the returned context would be passed further down to nest deeper spans
}

error handling

https://opentelemetry.io/docs/instrumentation/go/getting-started/#bonus-errors

import (
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/codes"
	"go.opentelemetry.io/otel/trace"
)


span.RecordError(err)
span.SetStatus(codes.Error, err.Error())

https://segmentfault.com/a/1190000042031697

Go - Step by step guide for implementing tracing on a microservices architecture

Setting up open telemetry for golang on google cloud platform

Log

Logging tools

Log categories

Diagnostic logs (for debugging), statistics logs (for metering/billing), audit logs.

What to log

Not too much — avoid:

  1. Content that could fit in one log entry spread across several entries
  2. Expected, normally-handled exceptions printing piles of useless stack traces
  3. "Temporary" logs added by developers for their own debugging convenience

Not too little — symptoms of too little logging:

  1. When a request fails, the logs alone cannot pinpoint the problem; a developer has to add logging ad hoc and ask the caller to resend the same request before it can be diagnosed
  2. Cannot tell whether the service's background tasks ran as expected
  3. Cannot determine the state of the service's in-memory data structures
  4. Cannot tell whether exception-handling logic (e.g. retries) executed correctly
  5. Cannot tell whether configuration was loaded correctly at startup

Easily overlooked items

  1. System configuration parameters
  2. Periodic background tasks: log start/end times and what was updated
  3. Exception-handling logic
  4. Key parameters, and the key cause when something fails

Log levels:

TRACE, DEBUG, INFO, WARN, ERROR, FATAL
A project's log-level conventions must be respected by everyone on it.

Continuously improve the logs

  • Refine logging while troubleshooting real problems
  • Review log content regularly

About RequestID

For a simple system, a random number is enough.

For a complex system, information such as the IP of the server handling the request and the time the request was received can be encoded into the RequestID, as sketched below.
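
A hypothetical sketch of such an encoding:

package main

import (
	"fmt"
	"math/rand"
	"time"
)

// newRequestID encodes the handling server's IP and the request arrival time
// into the ID, plus a random suffix to distinguish requests arriving in the
// same nanosecond. The format is illustrative, not a standard.
func newRequestID(serverIP string) string {
	return fmt.Sprintf("%s-%d-%04d", serverIP, time.Now().UnixNano(), rand.Intn(10000))
}

func main() {
	fmt.Println(newRequestID("10.0.0.12")) // e.g. 10.0.0.12-1717000000000000000-0042
}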

Dynamic log output

INFO-level logs normally record routine system state and the basic inputs and outputs of requests.

DEBUG records a request's processing in detail, down to the inputs and outputs of individual functions; deeply hidden problems usually require DEBUG logs.

DEBUG logs are typically an order of magnitude more voluminous than INFO.

We therefore want to toggle DEBUG logging dynamically.

Common approaches

Application layer: when an incoming request carries DEBUG=ON, emit the related DEBUG-level logs for that request (see the sketch below).

LB level: in OpenResty at the load-balancer layer, implement an interface through which an administrator can configure which user's, which bucket's, which object's, and which kind of operation (these are object-storage concepts) should produce DEBUG logs. OpenResty filters every request, and when a request matches the configured conditions it appends "DEBUG=ON" to the request's query string.
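
A hypothetical sketch of the application-layer approach:

package main

import (
	"context"
	"log"
	"net/http"
)

type debugKey struct{}

// debugMiddleware stores a per-request debug flag in the context when the
// request carries DEBUG=ON in its query string.
func debugMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		debug := r.URL.Query().Get("DEBUG") == "ON"
		ctx := context.WithValue(r.Context(), debugKey{}, debug)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// debugf logs only when the request's context has the debug flag set.
func debugf(ctx context.Context, format string, args ...any) {
	if on, _ := ctx.Value(debugKey{}).(bool); on {
		log.Printf("DEBUG "+format, args...)
	}
}

func main() {
	h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		debugf(r.Context(), "handling %s", r.URL.Path)
		w.WriteHeader(http.StatusOK)
	})
	log.Fatal(http.ListenAndServe(":8080", debugMiddleware(h)))
}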

Slow-operation logs

When the service receives a request it records the receive time (T1); when the response is about to be sent it records the send time (T2). A request is normally logged at INFO, but when the processing time (T2 - T1) exceeds a threshold (e.g. 10s), the entry can be promoted to WARN. This surfaces potential problems before they become outages. A sketch follows.
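
A hypothetical sketch:

package main

import (
	"log"
	"time"
)

// slowThreshold is illustrative; pick a value appropriate for the service.
const slowThreshold = 10 * time.Second

// timeRequest logs at INFO normally and promotes the entry to WARN when
// handling exceeds slowThreshold.
func timeRequest(name string, handler func()) {
	t1 := time.Now() // receive time (T1)
	handler()
	elapsed := time.Since(t1) // T2 - T1
	if elapsed > slowThreshold {
		log.Printf("WARN %s slow request: %s", name, elapsed)
		return
	}
	log.Printf("INFO %s handled in %s", name, elapsed)
}

func main() {
	timeRequest("GET /objects", func() { time.Sleep(50 * time.Millisecond) })
}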

Log monitoring

By monitoring keywords in the logs, failures can be detected and alerted on promptly, which is critical for meeting the service's SLA.

Miscellaneous

Watch the logs after going live

Route logs to different files

Managing log file size:

  • log rotation
  • periodic deletion
  • shipping log files to central storage

MongoDB Optimization

  • Enable profiling to find slow queries
  • Use explain to see whether the slow queries hit an index
  • Add the missing indexes (see the sketch below)
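
A sketch in mongosh (collection and field names are hypothetical):

// record operations slower than 100 ms in system.profile
db.setProfilingLevel(1, { slowms: 100 })

// inspect the slowest recorded operations
db.system.profile.find().sort({ millis: -1 }).limit(5)

// check whether a query uses an index (look for IXSCAN vs COLLSCAN)
db.orders.find({ status: "pending" }).explain("executionStats")

// add the index if it is missing
db.orders.createIndex({ status: 1 })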

Kubernetes Monitoring | Alerts

Kubernetes Alerting | Best Practices in 2022

  • How many resources your entire cluster is using
  • How many nodes are present and how many apps are on each node
  • The amount of memory used
  • Network bandwidth

Deployments and Pods:

  • Crashloop

Application

  • Latency
  • Requests Per Second
  • Responsiveness
  • Uptime
  • reaction times

  • Disk usage warning
  • Network connectivity issues
  • Pods that aren't working
  • Node resource consumption
  • Missing pods
  • Container restarts

GCP Managed Prometheus

Reference: Measure your golden signals with GKE Managed Prometheus and the nginx-ingress

Deploy the test components

helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace \
  --set controller.metrics.enabled=true

Deploy PodMonitoring

apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: ingress-nginx-metrics-monitoring
  namespace: ingress-nginx
spec:
  endpoints:
  - interval: 5s
    port: 10254
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx

The corresponding metrics then appear on the GCP Monitoring page.
