Symptom
The system crashes when libcurl is used in a multi-threaded environment.
[Thu Jun 09 09:31:35.063 2016] * Curl_ipv4_resolve_r failed for tunnel-manager.xxxxxxxx.com
[Thu Jun 09 09:31:35.063 2016] * Couldn't resolve host 'tunnel-manager.xxxxxxxx.com'
[Thu Jun 09 09:31:35.063 2016] * Closing connection 0
[Thu Jun 09 09:31:35.063 2016] curl_easy_perform() failed: Couldn't resolve host name
[Thu Jun 09 09:31:35.063 2016] * Hostname was NOT found in DNS cache
[Thu Jun 09 09:31:50.070 2016] * Previous alarm fired off!
[Thu Jun 09 09:31:50.092 2016] * Closing connection 0
[Thu Jun 09 09:31:50.092 2016] curl_easy_perform() failed: Timeout was reached
[Thu Jun 09 09:31:51.157 2016] * name lookup timed out
[Thu Jun 09 09:31:51.157 2016] * Previous alarm fired off!
[Thu Jun 09 09:31:51.157 2016] Got signal 11 [SIGSEGV]
[Thu Jun 09 09:31:51.157 2016] backtrace stack : 1(0xb59826e0)
[Thu Jun 09 09:31:51.157 2016] [bt] Execution path:
[Thu Jun 09 09:31:51.157 2016] [bt] [ 1] [0xb59826e0] /lib/libc.so.6(__default_rt_sa_restorer_v2+0) [0xb59826e0]
[Thu Jun 09 09:31:51.157 2016] [bt] [ 2] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [ 3] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [ 4] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [ 5] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [ 6] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [ 7] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [ 8] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [ 9] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [10] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [11] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [12] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [13] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [14] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] [15] [0x00000000] [(nil)]
[Thu Jun 09 09:31:51.157 2016] [bt] Execution path:
[Thu Jun 09 09:31:51.425 2016] MMB LEAK(pid=1102): 0x9A452000, 249856 bytes, ''
[Thu Jun 09 09:31:51.477 2016] mmz_userdev_release: mmb<0x9A452000> mapped to userspace 0x9f062000 will be force unmaped!
[Thu Jun 09 09:31:51.477 2016] MMB LEAK(pid=1102): 0x9A48F000, 32768 bytes, ''
[Thu Jun 09 09:31:51.477 2016] mmz_userdev_release: mmb<0x9A48F000> mapped to userspace 0xb17dc000 will be force unmaped!
[Thu Jun 09 09:31:51.477 2016] MMB LEAK(pid=1102): 0x9A4B1000, 65536 bytes, ''
[Thu Jun 09 09:31:51.477 2016] mmz_userdev_release: mmb<0x9A4B1000> mapped to userspace 0xb1724000 will be force unmaped!
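Cause
- If a DNS lookup takes too long, curl raises a SIGALRM signal to abort it
- This signal-based timeout is not thread-safe, so it can crash a multi-threaded program (e.g. the handler jumps back into the wrong thread, because the jump buffer curl_jmpenv is shared by all threads)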
int Curl_resolv_timeout(struct connectdata *conn,
                        const char *hostname,
                        int port,
                        struct Curl_dns_entry **entry,
                        long timeoutms)
{
  //
  // 1. Save the current stack environment (sets the return point)
  //
  if(sigsetjmp(curl_jmpenv, 1)) {
    //
    // 5. Execution resumes here when siglongjmp() is called
    //
    /* this is coming from a siglongjmp() after an alarm signal */
    failf(data, "name lookup timed out");
    rc = CURLRESOLV_ERROR;
    goto clean_up;
  }
  else {
    //
    // 2. Register the signal handler for SIGALRM
    //
    sigaction(SIGALRM, NULL, &sigact);
    keep_sigact = sigact;
    keep_copysig = TRUE; /* yes, we have a copy */
    sigact.sa_handler = alarmfunc;
  }
  //
  // 3. Run the DNS lookup
  //    => SIGALRM is raised on timeout
  //
  rc = Curl_resolv(conn, hostname, port, entry);
  ...
}
static
RETSIGTYPE alarmfunc(int sig)
{
  //
  // 4. When SIGALRM is raised, jump to the saved stack environment
  //
  (void)sig;
  siglongjmp(curl_jmpenv, 1);
  return;
}
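The five numbered steps above can be reproduced outside libcurl. The following is a minimal single-threaded sketch of the same alarm()/sigsetjmp()/siglongjmp() timeout pattern, not libcurl's actual code: call_with_alarm, on_alarm, and jmpenv are hypothetical names, and sleep() stands in for the blocking DNS lookup. It assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction/sigsetjmp under strict -std modes */
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* One process-wide jump buffer, playing the role of curl_jmpenv (hypothetical name). */
static sigjmp_buf jmpenv;

/* Step 4: the SIGALRM handler jumps back to the saved stack environment. */
static void on_alarm(int sig)
{
    (void)sig;
    siglongjmp(jmpenv, 1);
}

/*
 * Hypothetical helper: runs a blocking call lasting work_seconds under an
 * alarm of timeout_seconds. Returns 1 if the alarm aborted the call,
 * 0 if the call finished in time.
 */
int call_with_alarm(unsigned timeout_seconds, unsigned work_seconds)
{
    struct sigaction sa, old_sa;
    int timed_out;

    /* Step 2: register the SIGALRM handler and remember the previous one. */
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, &old_sa);

    /* Step 1: save the stack environment (and signal mask). */
    if (sigsetjmp(jmpenv, 1)) {
        /* Step 5: we arrive here via siglongjmp() from the handler. */
        timed_out = 1;
    } else {
        /* Step 3: start the timer, then make the blocking call. */
        alarm(timeout_seconds);
        unsigned left = work_seconds;
        while (left > 0)
            left = sleep(left);  /* stand-in for the DNS lookup */
        timed_out = 0;
    }

    alarm(0);                             /* cancel any pending alarm */
    sigaction(SIGALRM, &old_sa, NULL);    /* restore the previous handler */
    return timed_out;
}
```

In a single thread this behaves as intended. The problem described above appears once several threads share the one jmpenv buffer: SIGALRM may be delivered to a thread other than the one that called sigsetjmp(), and siglongjmp() then restores a stack environment that does not belong to the running thread.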
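Fix
- Set the CURLOPT_NOSIGNAL option to 1 so that libcurl does not use SIGALRM
  - curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1L);
  - With the standard synchronous resolver, DNS lookup timeouts are then no longer enforced
- Build libcurl with an asynchronous DNS backend
  - threaded-resolver
  - c-ares resolver (C library for asynchronous DNS requests)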
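Other notes
- Call curl_global_init explicitly, exactly once
  Even without an explicit call, curl_global_init is invoked automatically by the first call to curl_easy_init, but in a multi-threaded environment this can happen from several threads at the same time and cause conflicts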
References
- http://stackoverflow.com/questions/9191668/error-longjmp-causes-uninitialized-stack-frame
- https://curl.haxx.se/libcurl/c/threadsafe.html
- https://www.redhat.com/archives/libvir-list/2012-September/msg01960.html
- http://stackoverflow.com/questions/20835172/libcurl-strange-crashes-after-idle-time