### Summary
An exploitable out-of-bounds write vulnerability exists in the xls_mergedCells function of libxls 1.4. A specially crafted XLS file can cause memory corruption resulting in remote code execution. An attacker can send a malicious XLS file to trigger this vulnerability.
### Tested Versions
libxls 1.4, readxl package 1.0.0 for R (tested using Microsoft R 4.3.1)
### Product URLs
http://libxls.sourceforge.net/
### CVSSv3 Score
8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
### CWE
CWE-787: Out-of-bounds Write
### Details
libxls is a C library, supported on Windows, Mac and Linux, that reads Microsoft Excel File Format (XLS) files. The library is used by the readxl package, which can be installed for the R programming language. The out-of-bounds write occurs in the xls_mergedCells function. Let's take a look at the vulnerable code:
```
Line 606 void xls_mergedCells(xlsWorkSheet* pWS,BOF* bof,BYTE* buf)
Line 607 {
Line 608     int count=*((WORD*)buf);
Line 609     int i,c,r;
Line 610     struct MERGEDCELLS* span;
Line 611     verbose("Merged Cells");
Line 612     for (i=0;i<count;i++)
Line 613     {
Line 614         span=(struct MERGEDCELLS*)(buf+(2+i*sizeof(struct MERGEDCELLS)));
Line 615         // printf("Merged Cells: [%i,%i] [%i,%i] \n",span->colf,span->rowf,span->coll,span->rowl);
Line 616         for (r=span->rowf;r<=span->rowl;r++)
Line 617             for (c=span->colf;c<=span->coll;c++)
Line 618                 pWS->rows.row[r].cells.cell[c].ishiden=1;
Line 619         pWS->rows.row[span->rowf].cells.cell[span->colf].colspan=(span->coll-span->colf+1);
Line 620         pWS->rows.row[span->rowf].cells.cell[span->colf].rowspan=(span->rowl-span->rowf+1);
Line 621         pWS->rows.row[span->rowf].cells.cell[span->colf].ishiden=0;
Line 622     }
Line 623 }
```
The important variables are `buf` and `bof`, whose contents are read in raw form from the file. At `line 612`, the `count` value, which is read from the first WORD of `buf` at `line 608` and is exactly `bof.size`, controls the outer loop. At `line 614`, the `span` variable is pointed at successive parts of the `buf` buffer. Because the `span` structure is based on data read directly from the file, and the function checks none of its fields against the allocated row and cell arrays, an attacker fully controls not only the number of iterations of the `for` loops at lines 616 and 617 but also the offsets used for writes into the `pWS->rows` structure. Using our PoC we can observe the following values during a crash:
```
Starting program: /home/icewall/bugs/libxls-1.4.0/build/bin/xls2csv ./crashes/49a5608059427ce2f2c479e33c5e3ae4
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bd1e57 in xls_mergedCells (pWS=0x605830, bof=0x7fffffffdc10, buf=0x607230 "\b") at xls.c:619
619 pWS->rows.row[span->rowf].cells.cell[span->colf].colspan=(span->coll-span->colf+1);
(gdb) p/x *span
$1 = {rowf = 0xabcd, rowl = 0x1122, colf = 0x3344, coll = 0x6655}
(gdb) p/x *pWS->rows.row
$3 = {index = 0x0, fcell = 0x0, lcell = 0x0, height = 0x0, flags = 0x0, xf = 0x0, xfflags = 0x0, cells = {count = 0x0, cell = 0x607280}}
```
The MergedCell record starts at offset 7ED4Ch and has the form: `[BOF][SPAN][SPAN]...BOF.size*sizeof(MERGEDCELLS)...[SPAN]`
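As a complement to checking the span indices, the declared span count can be checked against the length of the record data before any span is read. The sketch below follows the way lines 608 and 614 read `buf`: a 2-byte count followed by consecutive spans. It assumes each span is four 16-bit fields (8 bytes) stored little-endian, as is usual for XLS records; `merged_span_count`, `read_le16`, `SPAN_SIZE` and the `size` parameter are hypothetical names and are not part of libxls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPAN_SIZE 8u  /* assumption: four 16-bit fields per span */

/* Read a little-endian 16-bit value without relying on alignment. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Returns the number of spans that can safely be read from a record
 * payload of `size` bytes, or -1 if the payload is too short for the
 * count it declares.
 */
long merged_span_count(const uint8_t *buf, size_t size)
{
    assert(buf != NULL);

    if (size < 2)
        return -1;                       /* no room for the count itself */

    uint16_t count = read_le16(buf);

    /* The spans start at offset 2, mirroring line 614 of the listing. */
    if ((size_t)count * SPAN_SIZE > size - 2)
        return -1;                       /* declared spans overrun the record */

    return (long)count;
}
```

A parser using this helper would iterate only when the return value is non-negative, then validate each span's row and column indices separately, since a well-formed record length says nothing about whether those indices fit the worksheet.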
### Crash Information
Crash in the Microsoft R platform:
```
> library(readxl)
> path <- readxl_example("49a5608059427ce2f2c479e33c5e3ae4.xls")
> lapply(excel_sheets(path), read_excel, path = path)
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
*** caught segfault ***
address 0x54, cause 'memory not mapped'
Traceback:
1: .Call("readxl_read_xls_", PACKAGE = "readxl", path, sheet_i, limits, shim, col_names, col_types, na, trim_ws, guess_max)
2: read_fun(path = path, sheet = sheet, limits = limits, shim = shim, col_names = col_names, col_types = col_types, na = na, trim_ws = trim_ws, guess_max = guess_max)
3: tibble::as_tibble(read_fun(path = path, sheet = sheet, limits = limits, shim = shim, col_names = col_names, col_types = col_types, na = na, trim_ws = trim_ws,
guess_max = guess_max), validate = FALSE)
4: tibble::repair_names(tibble::as_tibble(read_fun(path = path, sheet = sheet, limits = limits, shim = shim, col_names = col_names, col_types = col_types, na = na,
trim_ws = trim_ws, guess_max = guess_max), validate = FALSE), prefix = "X", sep = "__")
5: read_excel_(path = path, sheet = sheet, range = range, col_names = col_names, col_types = col_types, na = na, trim_ws = trim_ws, skip = skip, n_max = n_max, guess_max
= guess_max, excel_format(path))
6: FUN(X[[i]], ...)
7: lapply(excel_sheets(path), read_excel, path = path)
```
Directly in the libxls library:
```
==70269== Memcheck, a memory error detector
==70269== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==70269== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==70269== Command: ./xls2csv ./crashes/49a5608059427ce2f2c479e33c5e3ae4
==70269==
==70269== Invalid write of size 1
==70269== at 0x4C3106F: strcpy (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==70269== by 0x4E42F05: xls_open (xls.c:927)
==70269== by 0x400956: main (xls2csv.c:45)
==70269== Address 0x5425415 is 0 bytes after a block of size 21 alloc'd
==70269== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==70269== by 0x4E42EE3: xls_open (xls.c:926)
==70269== by 0x400956: main (xls2csv.c:45)
==70269==
==70269== Invalid read of size 8
==70269== at 0x4E41E57: xls_mergedCells (xls.c:619)
==70269== by 0x4E42CBC: xls_parseWorkSheet (xls.c:861)
==70269== by 0x400AEC: main (xls2csv.c:90)
==70269== Address 0x55e8b66 is 1,095,926 bytes inside an unallocated block of size 3,358,064 in arena "client"
==70269==
==70269== Invalid write of size 2
==70269== at 0x4E41E92: xls_mergedCells (xls.c:619)
==70269== by 0x4E42CBC: xls_parseWorkSheet (xls.c:861)
==70269== by 0x400AEC: main (xls2csv.c:90)
==70269== Address 0x7cf7f is not stack'd, malloc'd or (recently) free'd
```
### Timeline
* 2017-08-29 - Vendor Disclosure
* 2017-11-14 - Public Release