
A Complete Guide to Using ArgoCD

1. Installing ArgoCD and Initial Setup

1.1 Installing ArgoCD

# Create the argocd namespace
kubectl create namespace argocd

# Install ArgoCD
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Verify the installation
kubectl get pods -n argocd

1.2 Installing the ArgoCD CLI

# Linux (amd64)
curl -sSL -o argocd-linux-amd64 https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64
sudo install -m 555 argocd-linux-amd64 /usr/local/bin/argocd

# Or via Homebrew (macOS)
brew install argocd

1.3 Initial Access Setup

# Expose the ArgoCD server externally (development environments)
kubectl patch svc argocd-server -n argocd -p '{"spec":{"type":"LoadBalancer"}}'

# Or use port forwarding
kubectl port-forward svc/argocd-server -n argocd 8080:443

# Retrieve the initial admin password
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d

1.4 Logging in with the CLI

# Log in
argocd login localhost:8080

# Change the password
argocd account update-password

2. Registering Repositories

2.1 Registering a Public Repository

# Register via the CLI
argocd repo add https://github.com/your-org/k8s-manifests

# Register declaratively (a Secret labeled as a repository)
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: k8s-manifests-repo
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: repository
stringData:
  type: git
  url: https://github.com/your-org/k8s-manifests
EOF

2.2 Registering a Private Repository

# SSH key method
argocd repo add git@github.com:your-org/private-repo.git \
  --ssh-private-key-path ~/.ssh/id_rsa

# HTTPS token method
argocd repo add https://github.com/your-org/private-repo.git \
  --username your-username \
  --password ghp_your-token

2.3 Verifying Repository Registration

# List registered repositories
argocd repo list

# Test the connection
argocd repo get https://github.com/your-org/k8s-manifests

3. Creating and Managing Projects

3.1 Creating a Basic Project

# Create a Project via the CLI
argocd proj create data-engineering \
  --description "Data engineering team project" \
  --src "https://github.com/your-org/*" \
  --dest "https://kubernetes.default.svc,data-*"

3.2 Defining a Project in YAML

# data-engineering-project.yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: data-engineering
  namespace: argocd
spec:
  description: "Data engineering team project"
  
  # Allowed source repositories
  sourceRepos:
  - 'https://github.com/your-org/data-infrastructure'
  - 'https://github.com/your-org/airflow-dags'
  - 'https://helm-charts.bitnami.com/*'
  
  # Allowed deployment destinations
  destinations:
  - namespace: 'data-*'
    server: https://kubernetes.default.svc
  - namespace: airflow
    server: https://kubernetes.default.svc
  - namespace: kafka
    server: https://kubernetes.default.svc
    
  # Cluster-scoped resource permissions
  clusterResourceWhitelist:
  - group: ''
    kind: PersistentVolume
  - group: 'storage.k8s.io'
    kind: StorageClass
    
  # Namespace-scoped resource permissions
  namespaceResourceWhitelist:
  - group: 'apps'
    kind: Deployment
  - group: 'apps'
    kind: StatefulSet  
  - group: 'batch'
    kind: Job
  - group: 'batch'
    kind: CronJob
  - group: ''
    kind: Service
  - group: ''
    kind: ConfigMap
  - group: ''
    kind: Secret
    
  # Resources to deny
  namespaceResourceBlacklist:
  - group: ''
    kind: ResourceQuota
    
  # Role-based access control
  roles:
  - name: data-engineer
    description: Data engineer role
    policies:
    - p, proj:data-engineering:data-engineer, applications, *, data-engineering/*, allow
    - p, proj:data-engineering:data-engineer, applications, sync, data-engineering/*, allow
    groups:
    - data-engineering-team
    
  - name: data-engineer-readonly
    description: Read-only access
    policies:
    - p, proj:data-engineering:data-engineer-readonly, applications, get, data-engineering/*, allow
    groups:
    - data-engineering-viewers

# Apply the Project
kubectl apply -f data-engineering-project.yaml

# Verify the Project
argocd proj get data-engineering

4. Creating and Deploying Applications

4.1 Creating a Simple Application

# Create an Application via the CLI
argocd app create my-app \
  --repo https://github.com/your-org/k8s-manifests \
  --path manifests/my-app \
  --dest-server https://kubernetes.default.svc \
  --dest-namespace default \
  --project data-engineering

4.2 Helm Chart Application

# airflow-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: airflow-production
  namespace: argocd
  finalizers:
  - resources-finalizer.argocd.argoproj.io
spec:
  project: data-engineering
  
  source:
    repoURL: https://airflow.apache.org/
    chart: airflow
    targetRevision: 1.7.0
    helm:
      values: |
        # Airflow webserver settings
        webserver:
          service:
            type: LoadBalancer
        
        # Worker settings
        workers:
          replicas: 3
          resources:
            requests:
              memory: "2Gi"
              cpu: "1"
            limits:
              memory: "4Gi" 
              cpu: "2"
        
        # PostgreSQL settings
        postgresql:
          enabled: true
          auth:
            username: airflow
            database: airflow
            
        # Redis settings
        redis:
          enabled: true
          
  destination:
    server: https://kubernetes.default.svc
    namespace: airflow
    
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
    - ServerSideApply=true

4.3 Kustomize Application

# spark-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: spark-jobs
  namespace: argocd
spec:
  project: data-engineering
  
  source:
    repoURL: https://github.com/your-org/data-infrastructure
    path: spark/overlays/production
    targetRevision: main
    
  destination:
    server: https://kubernetes.default.svc
    namespace: spark-operator
    
  syncPolicy:
    automated:
      prune: false  # Protect Spark Job data
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
    retry:
      limit: 3
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m

4.4 Plain YAML Manifest Application

# kafka-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kafka-cluster
  namespace: argocd
spec:
  project: data-engineering
  
  source:
    repoURL: https://github.com/your-org/data-infrastructure
    path: kafka/production
    targetRevision: main
    directory:
      recurse: true
      jsonnet: {}
      
  destination:
    server: https://kubernetes.default.svc
    namespace: kafka
    
  syncPolicy:
    syncOptions:
    - CreateNamespace=true
    - ApplyOutOfSyncOnly=true

5. Application Management Tasks

5.1 Deploying an Application

# Apply the Application
kubectl apply -f airflow-app.yaml

# Or create it directly via the CLI and then sync
argocd app sync airflow-production

# Sync only a specific resource
argocd app sync airflow-production --resource apps:Deployment:airflow-webserver

5.2 Checking Application Status

# List all Applications
argocd app list

# Show details for a specific Application
argocd app get airflow-production

# View Application logs
argocd app logs airflow-production

# View the resource tree
argocd app resources airflow-production

5.3 Changing an Application's Sync Policy

# Enable automated sync
argocd app set airflow-production --sync-policy automated

# Enable automatic pruning
argocd app set airflow-production --auto-prune

# Enable self-healing
argocd app set airflow-production --self-heal
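
These flags simply toggle fields on the Application resource itself. For reference, here is a minimal sketch of the equivalent declarative syncPolicy block, shown as an excerpt that reuses the airflow-production example from above:

# Declarative equivalent of the flags above (Application excerpt)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: airflow-production
  namespace: argocd
spec:
  # ... project, source, and destination as defined earlier ...
  syncPolicy:
    automated:
      prune: true      # --auto-prune
      selfHeal: true   # --self-heal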

6. Real-World Usage Scenarios

6.1 Scenario 1: Deploying Airflow DAGs

Step 1: Git repository structure

airflow-dags/
├── dags/
│   ├── data_ingestion_dag.py
│   ├── data_transformation_dag.py
│   └── data_export_dag.py
├── k8s/
│   ├── configmap.yaml
│   └── deployment.yaml
└── values/
    ├── dev-values.yaml
    ├── staging-values.yaml
    └── prod-values.yaml

 

Step 2: Configure DAG deployment with a ConfigMap

# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: airflow-dags
  namespace: airflow
data:
  data_ingestion_dag.py: |
    from datetime import datetime, timedelta
    from airflow import DAG
    from airflow.operators.python import PythonOperator
    
    default_args = {
        'owner': 'data-team',
        'depends_on_past': False,
        'start_date': datetime(2024, 1, 1),
        'email_on_failure': True,
        'email_on_retry': False,
        'retries': 1,
        'retry_delay': timedelta(minutes=5),
    }
    
    dag = DAG(
        'data_ingestion',
        default_args=default_args,
        description='Data ingestion pipeline',
        schedule_interval=timedelta(hours=1),
        catchup=False,
    )
    
    def extract_data():
        # Data extraction logic
        pass
    
    extract_task = PythonOperator(
        task_id='extract_data',
        python_callable=extract_data,
        dag=dag,
    )

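The repository tree in Step 1 also lists k8s/deployment.yaml, which is not shown here. As an illustration only (the name, image, and mount path below are assumptions, not taken from the actual repository), such a Deployment could mount the DAG ConfigMap into the Airflow DAGs folder:

# k8s/deployment.yaml (illustrative sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: airflow-scheduler-standalone
  namespace: airflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: airflow-scheduler-standalone
  template:
    metadata:
      labels:
        app: airflow-scheduler-standalone
    spec:
      containers:
      - name: scheduler
        image: apache/airflow:2.8.1   # assumed image
        args: ["scheduler"]
        volumeMounts:
        - name: dags
          mountPath: /opt/airflow/dags
      volumes:
      - name: dags
        configMap:
          name: airflow-dags          # the ConfigMap defined in Step 2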
 

Step 3: Create the Application

argocd app create airflow-dags \
  --repo https://github.com/your-org/airflow-dags \
  --path k8s \
  --dest-server https://kubernetes.default.svc \
  --dest-namespace airflow \
  --project data-engineering \
  --sync-policy automated

6.2 Scenario 2: Deploying a Spark Application

Step 1: Deploy the Spark Operator

# spark-operator-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: spark-operator
  namespace: argocd
spec:
  project: data-engineering
  source:
    repoURL: https://googlecloudplatform.github.io/spark-on-k8s-operator
    chart: spark-operator
    targetRevision: 1.1.27
    helm:
      values: |
        sparkJobNamespace: spark-jobs
        enableWebhook: true
        enableMetrics: true
  destination:
    server: https://kubernetes.default.svc
    namespace: spark-operator
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true

 

Step 2: Define the Spark Job

# spark-jobs/data-processing-job.yaml
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: data-processing-job
  namespace: spark-jobs
spec:
  type: Scala
  mode: cluster
  image: "your-registry/spark-app:v1.0.0"
  imagePullPolicy: Always
  mainClass: com.company.DataProcessingJob
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-app.jar"
  
  sparkVersion: "3.3.0"
  
  deps:
    jars:
    - "https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/3.3.1/hadoop-aws-3.3.1.jar"
    
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "2g"
    labels:
      version: "3.3.0"
    serviceAccount: spark-driver
    
  executor:
    cores: 2
    instances: 3
    memory: "4g"
    labels:
      version: "3.3.0"
      
  restartPolicy:
    type: OnFailure
    onFailureRetries: 2
    onFailureRetryInterval: 10
    onSubmissionFailureRetries: 1

6.3 Scenario 3: Managing Per-Environment Deployments

Step 1: Create one Application per environment

# Development environment
argocd app create airflow-dev \
  --repo https://github.com/your-org/data-infrastructure \
  --path airflow/overlays/dev \
  --dest-server https://dev-cluster \
  --dest-namespace airflow \
  --project data-engineering \
  --sync-policy automated

# Staging environment
argocd app create airflow-staging \
  --repo https://github.com/your-org/data-infrastructure \
  --path airflow/overlays/staging \
  --dest-server https://staging-cluster \
  --dest-namespace airflow \
  --project data-engineering

# Production environment
argocd app create airflow-prod \
  --repo https://github.com/your-org/data-infrastructure \
  --path airflow/overlays/prod \
  --dest-server https://prod-cluster \
  --dest-namespace airflow \
  --project data-engineering
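
The airflow/overlays/dev, staging, and prod paths above imply a Kustomize overlay layout. A minimal sketch of what the prod overlay might contain, assuming a shared base at airflow/base (the file names and patch values are illustrative, not taken from the original repository):

# airflow/overlays/prod/kustomization.yaml (illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: airflow
resources:
- ../../base
patches:
- path: worker-replicas-patch.yaml

# airflow/overlays/prod/worker-replicas-patch.yaml (illustrative)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: airflow-worker
spec:
  replicas: 5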

7. Monitoring and Debugging

7.1 Checking Application Events

# Check Application events and the current operation
argocd app get airflow-production --show-operation

# Check the sync history
argocd app history airflow-production

# Inspect the manifests of a specific revision
argocd app manifests airflow-production --revision 5

7.2 Debugging Resource Status

# Check the status of a specific resource
kubectl describe deployment airflow-webserver -n airflow

# Check Pod logs
kubectl logs -f deployment/airflow-scheduler -n airflow

# Check events
kubectl get events -n airflow --sort-by='.lastTimestamp'

7.3 Troubleshooting ArgoCD Itself

# Check the application controller logs (the controller runs as a StatefulSet)
kubectl logs -f statefulset/argocd-application-controller -n argocd

# Check the ArgoCD server logs
kubectl logs -f deployment/argocd-server -n argocd

# Check for repository connection issues
argocd repo get https://github.com/your-org/repo --refresh

8. Security and Best Practices

8.1 RBAC Configuration

# argocd-rbac-cm ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.default: role:readonly
  policy.csv: |
    # Data engineering team permissions
    p, role:data-engineer, applications, *, data-engineering/*, allow
    p, role:data-engineer, certificates, *, *, allow
    p, role:data-engineer, repositories, *, *, allow
    
    # Team-to-role group mappings
    g, data-engineering-team, role:data-engineer
    g, data-engineering-leads, role:admin

8.2 Secret Management

# Using Sealed Secrets
kubectl create secret generic db-credentials \
  --from-literal=username=admin \
  --from-literal=password=secret123 \
  --dry-run=client -o yaml | \
  kubeseal -o yaml > sealed-secret.yaml
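
The generated sealed-secret.yaml is safe to commit to Git; ArgoCD syncs it like any other manifest, and the sealed-secrets controller decrypts it into a regular Secret inside the cluster. Its rough shape looks like the sketch below (the encrypted values are placeholders, not real output):

# sealed-secret.yaml (shape only; encryptedData values are placeholders)
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-credentials
  namespace: default
spec:
  encryptedData:
    username: AgB3kQ...   # encrypted; only the in-cluster controller can decrypt it
    password: AgBx7Z...
  template:
    metadata:
      name: db-credentials
      namespace: default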

8.3 Security Best Practices

  • Prefer the SSH key method when registering private repositories
  • Manage sensitive values with Sealed Secrets or the External Secrets Operator
  • Deploy to production only after manual approval (see the sketch after this list)
  • Perform regular access reviews
  • Sign Git commits
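
For the manual-approval point above, a common pattern is to leave automated sync disabled on the production Application (as in the airflow-prod example earlier) and run argocd app sync only after approval; sync windows on the AppProject can further restrict when syncs may run. A minimal sketch, assuming a weekday deployment window (the schedule and application name are illustrative):

# AppProject excerpt: restrict syncs to a defined window
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: data-engineering
  namespace: argocd
spec:
  syncWindows:
  - kind: allow
    schedule: "0 9 * * 1-5"   # weekdays at 09:00
    duration: 8h
    applications:
    - airflow-prod
    manualSync: true          # still permit manual syncs when the window blocks automatic ones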

With this guide, you should be able to understand the end-to-end workflow of deploying and managing data pipelines with ArgoCD and apply it to your own work.

