A Complete Guide to Using ArgoCD
1. ArgoCD Installation and Initial Setup
1.1 Installing ArgoCD
# Create the ArgoCD namespace
kubectl create namespace argocd
# Install ArgoCD
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Verify the installation
kubectl get pods -n argocd
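If you'd rather block until the components are actually up than re-run the pod listing, kubectl wait does the job; this is optional and uses only standard kubectl:
# Optionally wait until every ArgoCD pod reports Ready (5-minute timeout)
kubectl wait --for=condition=Ready pods --all -n argocd --timeout=300s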
1.2 Installing the ArgoCD CLI
# Linux/macOS
curl -sSL -o argocd-linux-amd64 https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64
sudo install -m 555 argocd-linux-amd64 /usr/local/bin/argocd
# Or via Homebrew (macOS)
brew install argocd
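To confirm the binary is installed and on your PATH (the --client flag skips contacting any server):
# Check the installed CLI version without needing a server connection
argocd version --client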
1.3 Initial Access Setup
# Expose the ArgoCD server externally (development environments)
kubectl patch svc argocd-server -n argocd -p '{"spec":{"type":"LoadBalancer"}}'
# Or use port forwarding
kubectl port-forward svc/argocd-server -n argocd 8080:443
# Retrieve the initial admin password
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
1.4 Logging in with the CLI
# Log in; --insecure is usually needed here because a fresh install serves a self-signed certificate
argocd login localhost:8080 --insecure
# Change the admin password
argocd account update-password
2. Registering Repositories
2.1 Registering a Public Repository
# Register via the CLI
argocd repo add https://github.com/your-org/k8s-manifests
# Register declaratively with YAML (ArgoCD reads repositories from Secrets labeled as below)
kubectl apply -n argocd -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: k8s-manifests-repo
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: repository
stringData:
  type: git
  url: https://github.com/your-org/k8s-manifests
EOF
2.2 Registering a Private Repository
# SSH key authentication
argocd repo add git@github.com:your-org/private-repo.git \
  --ssh-private-key-path ~/.ssh/id_rsa
# HTTPS token authentication
argocd repo add https://github.com/your-org/private-repo.git \
  --username your-username \
  --password ghp_your-token
2.3 Verifying Repository Registration
# List registered repositories
argocd repo list
# Check the connection status
argocd repo get https://github.com/your-org/k8s-manifests
3. Creating and Managing Projects
3.1 Creating a Basic Project
# Create a Project via the CLI
argocd proj create data-engineering \
  --description "Data engineering team project" \
  --src "https://github.com/your-org/*" \
  --dest "https://kubernetes.default.svc,data-*"
3.2 Defining a Project in YAML
# data-engineering-project.yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: data-engineering
  namespace: argocd
spec:
  description: "Data engineering team project"
  # Allowed source repositories
  sourceRepos:
    - 'https://github.com/your-org/data-infrastructure'
    - 'https://github.com/your-org/airflow-dags'
    - 'https://helm-charts.bitnami.com/*'
  # Allowed deployment destinations
  destinations:
    - namespace: 'data-*'
      server: https://kubernetes.default.svc
    - namespace: airflow
      server: https://kubernetes.default.svc
    - namespace: kafka
      server: https://kubernetes.default.svc
  # Cluster-scoped resource permissions
  clusterResourceWhitelist:
    - group: ''
      kind: PersistentVolume
    - group: 'storage.k8s.io'
      kind: StorageClass
  # Namespace-scoped resource permissions
  namespaceResourceWhitelist:
    - group: 'apps'
      kind: Deployment
    - group: 'apps'
      kind: StatefulSet
    - group: 'batch'
      kind: Job
    - group: 'batch'
      kind: CronJob
    - group: ''
      kind: Service
    - group: ''
      kind: ConfigMap
    - group: ''
      kind: Secret
  # Denied resources
  namespaceResourceBlacklist:
    - group: ''
      kind: ResourceQuota
  # Role-based access control
  roles:
    - name: data-engineer
      description: Data engineer role
      policies:
        - p, proj:data-engineering:data-engineer, applications, *, data-engineering/*, allow
        - p, proj:data-engineering:data-engineer, applications, sync, data-engineering/*, allow
      groups:
        - data-engineering-team
    - name: data-engineer-readonly
      description: Read-only access
      policies:
        - p, proj:data-engineering:data-engineer-readonly, applications, get, data-engineering/*, allow
      groups:
        - data-engineering-viewers
# Apply the Project
kubectl apply -f data-engineering-project.yaml
# Verify the Project
argocd proj get data-engineering
4. Creating and Deploying Applications
4.1 Creating a Simple Application
# Create an Application via the CLI
argocd app create my-app \
  --repo https://github.com/your-org/k8s-manifests \
  --path manifests/my-app \
  --dest-server https://kubernetes.default.svc \
  --dest-namespace default \
  --project data-engineering
4.2 Helm Chart Application
# airflow-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: airflow-production
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: data-engineering
  source:
    repoURL: https://airflow.apache.org/
    chart: airflow
    targetRevision: 1.7.0
    helm:
      values: |
        # Airflow settings
        webserver:
          service:
            type: LoadBalancer
        # Worker settings
        workers:
          replicas: 3
          resources:
            requests:
              memory: "2Gi"
              cpu: "1"
            limits:
              memory: "4Gi"
              cpu: "2"
        # PostgreSQL settings
        postgresql:
          enabled: true
          auth:
            username: airflow
            database: airflow
        # Redis settings
        redis:
          enabled: true
  destination:
    server: https://kubernetes.default.svc
    namespace: airflow
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true
4.3 Kustomize Application
# spark-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: spark-jobs
  namespace: argocd
spec:
  project: data-engineering
  source:
    repoURL: https://github.com/your-org/data-infrastructure
    path: spark/overlays/production
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: spark-operator
  syncPolicy:
    automated:
      prune: false  # protect Spark Job data
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
    retry:
      limit: 3
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
4.4 Plain YAML Manifest Application
# kafka-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kafka-cluster
  namespace: argocd
spec:
  project: data-engineering
  source:
    repoURL: https://github.com/your-org/data-infrastructure
    path: kafka/production
    targetRevision: main
    directory:
      recurse: true
      jsonnet: {}
  destination:
    server: https://kubernetes.default.svc
    namespace: kafka
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
      - ApplyOutOfSyncOnly=true
5. Application Management Tasks
5.1 Deploying an Application
# Apply the Application manifest
kubectl apply -f airflow-app.yaml
# Or sync directly via the CLI
argocd app sync airflow-production
# Sync only a specific resource
argocd app sync airflow-production --resource apps:Deployment:airflow-webserver
5.2 Checking Application Status
# List all Applications
argocd app list
# Show details for a specific Application
argocd app get airflow-production
# View Application logs
argocd app logs airflow-production
# View the resource tree
argocd app resources airflow-production
5.3 Changing the Sync Policy
# Enable automated sync
argocd app set airflow-production --sync-policy automated
# Enable automatic pruning
argocd app set airflow-production --auto-prune
# Enable self-healing
argocd app set airflow-production --self-heal
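These flags simply mutate the Application's syncPolicy field. For Applications managed declaratively it is usually better to commit the equivalent block to Git so the policy itself stays version-controlled; the fragment below mirrors the three commands above:
# Declarative equivalent (Application spec fragment)
spec:
  syncPolicy:
    automated:
      prune: true     # --auto-prune
      selfHeal: true  # --self-heal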
6. Real-World Usage Scenarios
6.1 Scenario 1: Deploying Airflow DAGs
Step 1: Git repository structure
airflow-dags/
├── dags/
│   ├── data_ingestion_dag.py
│   ├── data_transformation_dag.py
│   └── data_export_dag.py
├── k8s/
│   ├── configmap.yaml
│   └── deployment.yaml
└── values/
    ├── dev-values.yaml
    ├── staging-values.yaml
    └── prod-values.yaml
Step 2: Deploying DAGs via a ConfigMap
# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: airflow-dags
  namespace: airflow
data:
  data_ingestion_dag.py: |
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    default_args = {
        'owner': 'data-team',
        'depends_on_past': False,
        'start_date': datetime(2024, 1, 1),
        'email_on_failure': True,
        'email_on_retry': False,
        'retries': 1,
        'retry_delay': timedelta(minutes=5),
    }

    dag = DAG(
        'data_ingestion',
        default_args=default_args,
        description='Data ingestion pipeline',
        schedule_interval=timedelta(hours=1),
        catchup=False,
    )

    def extract_data():
        # data extraction logic goes here
        pass

    extract_task = PythonOperator(
        task_id='extract_data',
        python_callable=extract_data,
        dag=dag,
    )
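For the scheduler to pick these DAGs up, k8s/deployment.yaml must mount the ConfigMap at Airflow's DAGs folder. That file's contents aren't shown in this guide, so the fragment below is only a sketch, assuming the default DAGs path of the official Airflow image:
# Hypothetical fragment of k8s/deployment.yaml: mount the ConfigMap where
# Airflow scans for DAG files
        volumeMounts:
          - name: dags
            mountPath: /opt/airflow/dags
      volumes:
        - name: dags
          configMap:
            name: airflow-dags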
Step 3: Creating the Application
argocd app create airflow-dags \
  --repo https://github.com/your-org/airflow-dags \
  --path k8s \
  --dest-server https://kubernetes.default.svc \
  --dest-namespace airflow \
  --project data-engineering \
  --sync-policy automated
6.2 Scenario 2: Deploying a Spark Application
Step 1: Deploying the Spark Operator
# spark-operator-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: spark-operator
  namespace: argocd
spec:
  project: data-engineering
  source:
    repoURL: https://googlecloudplatform.github.io/spark-on-k8s-operator
    chart: spark-operator
    targetRevision: 1.1.27
    helm:
      values: |
        sparkJobNamespace: spark-jobs
        enableWebhook: true
        enableMetrics: true
  destination:
    server: https://kubernetes.default.svc
    namespace: spark-operator
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
Step 2: Defining the Spark Job
# spark-jobs/data-processing-job.yaml
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: data-processing-job
  namespace: spark-jobs
spec:
  type: Scala
  mode: cluster
  image: "your-registry/spark-app:v1.0.0"
  imagePullPolicy: Always
  mainClass: com.company.DataProcessingJob
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-app.jar"
  sparkVersion: "3.3.0"
  deps:
    jars:
      - "https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/3.3.1/hadoop-aws-3.3.1.jar"
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "2g"
    labels:
      version: "3.3.0"
    serviceAccount: spark-driver
  executor:
    cores: 2
    instances: 3
    memory: "4g"
    labels:
      version: "3.3.0"
  restartPolicy:
    type: OnFailure
    onFailureRetries: 2
    onFailureRetryInterval: 10
    onSubmissionFailureRetries: 1
6.3 Scenario 3: Managing Deployments per Environment
Step 1: Create an Application per environment
# Development
argocd app create airflow-dev \
  --repo https://github.com/your-org/data-infrastructure \
  --path airflow/overlays/dev \
  --dest-server https://dev-cluster \
  --dest-namespace airflow \
  --project data-engineering \
  --sync-policy automated
# Staging
argocd app create airflow-staging \
  --repo https://github.com/your-org/data-infrastructure \
  --path airflow/overlays/staging \
  --dest-server https://staging-cluster \
  --dest-namespace airflow \
  --project data-engineering
# Production (left on manual sync on purpose; see section 8.3)
argocd app create airflow-prod \
  --repo https://github.com/your-org/data-infrastructure \
  --path airflow/overlays/prod \
  --dest-server https://prod-cluster \
  --dest-namespace airflow \
  --project data-engineering
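The --dest-server URLs above only work if those clusters have been registered with ArgoCD first. Registration is done from kubeconfig contexts; the context names below are placeholders for your own:
# Register external clusters from kubeconfig contexts (names are examples)
argocd cluster add dev-cluster-context
argocd cluster add staging-cluster-context
argocd cluster add prod-cluster-context
# Confirm the registered clusters and their API server URLs
argocd cluster list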
7. Monitoring and Debugging
7.1 Checking Application Events
# Check Application events and the current operation
argocd app get airflow-production --show-operation
# View the sync history
argocd app history airflow-production
# Inspect the manifests of a specific revision
argocd app manifests airflow-production --revision 5
7.2 Debugging Resource Status
# Describe a specific resource
kubectl describe deployment airflow-webserver -n airflow
# Tail Pod logs
kubectl logs -f deployment/airflow-scheduler -n airflow
# Check recent events
kubectl get events -n airflow --sort-by='.lastTimestamp'
7.3 Troubleshooting ArgoCD Itself
# Application controller logs (the controller runs as a StatefulSet in current install manifests)
kubectl logs -f statefulset/argocd-application-controller -n argocd
# API server logs
kubectl logs -f deployment/argocd-server -n argocd
# Recheck repository connectivity
argocd repo get https://github.com/your-org/repo --refresh
8. Security and Best Practices
8.1 Configuring RBAC
# argocd-rbac-cm ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.default: role:readonly
  policy.csv: |
    # Data engineering team permissions
    p, role:data-engineer, applications, *, data-engineering/*, allow
    p, role:data-engineer, certificates, *, *, allow
    p, role:data-engineer, repositories, *, *, allow
    # Map groups to roles
    g, data-engineering-team, role:data-engineer
    g, data-engineering-leads, role:admin
8.2 Managing Secrets
# Encrypt a Secret with Sealed Secrets
kubectl create secret generic db-credentials \
  --from-literal=username=admin \
  --from-literal=password=secret123 \
  --dry-run=client -o yaml | \
  kubeseal -o yaml > sealed-secret.yaml
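The generated sealed-secret.yaml is safe to commit to Git, since only the sealed-secrets controller in the cluster can decrypt it. Roughly what it looks like; the ciphertext values here are illustrative placeholders:
# sealed-secret.yaml (output of kubeseal; ciphertext abbreviated)
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-credentials
  namespace: default
spec:
  encryptedData:
    username: AgBy3i4OJSWK...
    password: AgBh8kX2nQpL...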
8.3 Security Best Practices
- Prefer SSH key authentication for private repositories
- Keep sensitive values out of Git with Sealed Secrets or the External Secrets Operator
- Require manual approval before production deployments (see the sketch after this list)
- Perform regular access reviews
- Sign Git commits
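One simple way to enforce the manual-approval rule, reusing the airflow-prod naming from scenario 6.3: give the production Application no automated block, so Git changes surface as OutOfSync but nothing deploys until someone syncs explicitly.
# Production Application spec fragment: without `automated`, changes wait
# for a human-initiated sync
syncPolicy:
  syncOptions:
    - CreateNamespace=true
Review and release then becomes an explicit two-step process:
argocd app diff airflow-prod   # inspect what would change
argocd app sync airflow-prod   # deploy after approval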
This guide has walked through the end-to-end workflow of deploying and managing data pipelines with ArgoCD, from installation through security practices, so you can apply it directly to your own environment.