
iOS Symbol Parsing and Reconstruction

1. Background

1.1 What is symbol parsing

Symbol parsing maps the addresses in a crash log to readable symbols and line numbers in the source files, making it easier for developers to locate and fix problems. As shown below, the first, completely unreadable crash log becomes the third, fully readable log after complete symbolication. The ByteDance stability monitoring platform has to resolve many kinds of iOS logs (crashes, hangs, lag reports, custom exceptions, and so on), so symbol parsing is a necessary underlying capability of the monitoring platform.

[figure]

1.2 Native symbol parsing tools

symbolicatecrash

Xcode ships with symbolicatecrash, a Perl script located at /Applications/Xcode.app/Contents/SharedFrameworks/DVTFoundation.framework/Versions/A/Resources/symbolicatecrash. It wraps the step-by-step parsing operations into a single command (you can also copy the script out and invoke it directly).

Usage: symbolicatecrash log.crash -d xxx.app.dSYM

Advantage: it conveniently symbolicates an entire crash log in one pass.

Disadvantages:

  1. It is slow.

  2. The granularity is coarse; it cannot symbolicate a single frame on its own.

atos

Usage: atos -o xxx.app.dSYM/Contents/Resources/DWARF/xxx -arch arm64/armv7 -l loadAddress runtimeAddress

Advantages: fast, can symbolicate an individual address, and the results are easy for the upper layer to cache.

1.3 Problems with native tools

Both of these tools share two major defects:

  1. They are standalone tools and cannot be offered as an online service.
  2. They must run on macOS, while ByteDance's server infrastructure is entirely Linux-based, so the company's existing platforms and frameworks cannot be reused, which brings very high machine, deployment, and operations costs.

2. Exploration of historical solutions

To solve these two pain points and build an online iOS symbol parsing service that runs on Linux, we explored the following approaches over time:

Scheme 1: llvm-atosl

[figure]

This is essentially LLVM's built-in symbolization tooling with some customized modifications.
The online flow for parsing a single log line is shown below:

[figure]

The scheme had no major problems at first, but over time, during the evening peak, parsing frequently timed out and failed, leaving only address offsets without symbols, so we had to locate the bottleneck and optimize further.
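For context, here is a minimal sketch of how a Go service can drive such a command-line symbolizer, one invocation per frame; the binary name, flags, timeout, and paths are illustrative assumptions, not the production code:

package main

import (
    "context"
    "fmt"
    "os/exec"
    "time"
)

// symbolicateViaCLI shells out to an llvm-atosl-style binary once per frame.
// Every call forks a process and consumes stdin/stdout/stderr fds, which is
// exactly the overhead Scheme 2 later tried to remove.
func symbolicateViaCLI(dwarfPath, arch, loadAddr, runtimeAddr string) (string, error) {
    ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
    defer cancel()

    cmd := exec.CommandContext(ctx, "llvm-atosl",
        "-o", dwarfPath, "-arch", arch, "-l", loadAddr, runtimeAddr)
    out, err := cmd.Output()
    if err != nil {
        return "", fmt.Errorf("llvm-atosl failed: %w", err)
    }
    return string(out), nil
}

func main() {
    // Hypothetical dSYM path and addresses, for illustration only.
    sym, err := symbolicateViaCLI("./Aweme.app.dSYM/Contents/Resources/DWARF/Aweme",
        "arm64", "0x100008000", "0x102cff4b8")
    fmt.Println(sym, err)
}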

Scheme 2: llvm-atosl-cgo

This scheme simply calls llvm-atosl through cgo instead of through the command line.
After Scheme 1 went live we observed that the single-line-parsing pct99 during the evening peak was extremely high, more and more requests failed with timeouts, and at one point the whole service was brought down during the evening peak. Logging into the online machines we saw a lot of "too many open files" errors, so we suspected that fd usage had exceeded the limit. Since each execution of llvm-atosl takes at least 3 fds (stdin, stdout, and stderr), we tried wrapping llvm-atosl from a command-line tool into a C library and calling it from Go via cgo:
package main

/*
#cgo CFLAGS: -I./tools
#cgo LDFLAGS: -lstdc++ -lncurses -lm -L${SRCDIR}/tools/ -lllvm-atosl
#include "llvm-atosl-api.h"
#include <stdlib.h>
*/
import "C"

import (
  "fmt"
  "strconv"
  "strings"
  "unsafe"
)

func main() {
    result := symbolicate("~/dsym/7.8.0(78007)eb7dd4d73df0329692003523fc2c9586/Aweme.app.dSYM/Contents/Resources/DWARF/Aweme", "arm64", "0x100008000", "0x0000000102cff4b8")
    fmt.Println(result)
}

func symbolicate(go_path string, go_arch string, go_loadAddress string, go_address string) string {
    c_path := C.CString(go_path)
    c_arch := C.CString(go_arch)

    loadAddress := hex2int(go_loadAddress)
    c_loadAddress := C.ulong(loadAddress)

    address := hex2int(go_address)
    c_address := C.ulong(address)

    c_result := C.getSymbolicatedName(c_path, c_arch, c_loadAddress, c_address)

    result := C.GoString(c_result)

    C.free(unsafe.Pointer(c_path))
    C.free(unsafe.Pointer(c_arch))
    C.free(unsafe.Pointer(c_result))

    return result
}

func hex2int(hexStr string) uint64 {
    // remove the 0x prefix if present in the input string
    cleaned := strings.Replace(hexStr, "0x", "", -1)

    // parse as base-16 into a 64-bit unsigned integer
    result, _ := strconv.ParseUint(cleaned, 16, 64)
    return result
}
The expectation was that switching from cross-process calls to in-process calls would reduce both fd usage and the overhead of inter-process communication. After going live, however, parsing efficiency did not improve; it actually dropped.
According to the conclusions of the blog post "How to make Go calling C 10x faster?" (see reference [1]), cgo performs poorly for two reasons:
  1. Go runtime thread stacks are relatively small, and cgo calls are constrained by the number of P (Processor, roughly the goroutine scheduler) and M (Machine, roughly the OS thread). In practice this can be simply understood as being limited by GOMAXPROCS, which since Go 1.5 defaults to the number of CPU cores; once the number of concurrent cgo calls exceeds GOMAXPROCS, calls start to block.
  2. Because the C/C++ runtime must be kept alive, cgo has to translate and coordinate between two runtimes and two ABIs (application binary interfaces), which brings substantial overhead.
This showed that our conjectures about fd exhaustion and cross-process call overhead being the bottleneck did not hold, so this scheme also proved infeasible.

Scheme 3: golang-atos

Based on Go's standard library debug/dwarf, we can parse DWARF files and resolve addresses to symbols, which could replace the llvm-atosl implementation, and Go's goroutines naturally support high concurrency. A reference implementation:

package dwarfexample

import (
    "debug/dwarf"
    "debug/macho"
    "log"

    "github.com/go-errors/errors"
)

func ParseFile(path string, address uint64) (err error) {
    var f *macho.FatFile
    if f, err = macho.OpenFat(path); err != nil {
        return errors.New("open file error: " + err.Error())
    }

    var d *dwarf.Data
    if d, err = f.Arches[1].DWARF(); err != nil {
        return
    }

    r := d.Reader()

    var entry *dwarf.Entry
    if entry, err = r.SeekPC(address); err != nil {
        log.Print("Not Found ...")
        return
    } else {
        log.Print("Found ...")
    }

    log.Printf("tag: %+v, lowpc: %+v", entry.Tag, entry.Val(dwarf.AttrLowpc))

    var lineReader *dwarf.LineReader
    if lineReader, err = d.LineReader(entry); err != nil {
        return
    }

    var line dwarf.LineEntry

    if err = lineReader.SeekPC(address, &line); err != nil {
        return
    }

    log.Printf("line %+v:%+v", line.File.Name, line.Line)

    return
}
However, unit tests showed that golang-atos was about 10x slower than llvm-atosl at single-line parsing, because Go's DWARF parsing is more time-consuming than LLVM's C++ implementation. So this scheme was not feasible either.

3. The final solution

3.1 Overall design

Later, monitoring showed that whenever parsing slowed down and errors spiked, the read traffic on CephFS, the distributed file system that stores the symbol table files, was unusually high:

[figure]
[figure]

We then realized that the real bottleneck of symbol parsing is network I/O. The symbol table files of super apps such as Douyin (TikTok) often exceed 1 GB, and a large number of internal test builds are uploaded every day. Although symbol tables are cached locally on the physical machines, there are always long-tail symbol tables that miss the cache and have to be pulled from the distributed file system into the backend container instances during the evening peak; and since each parsing request is randomly dispatched to a physical machine in the cluster, the problem is amplified:
The higher the network I/O, the slower the symbol parsing; the slower the symbol parsing, the more requests pile up, which in turn drives network I/O even higher. This vicious cycle can eventually bring the whole service down.
In the end we adopted the final solution: fully pre-parse the mapping between addresses and symbols when the symbol table file is uploaded, and have the online path query the online cache directly:

[figure]
Core changes:
  1. Instead of locating the symbol table file at crash time and invoking a command-line tool to parse it, the mapping between all addresses and symbols is fully pre-parsed when the symbol table file is uploaded and written to structured storage; at crash time we only look up the cache.
  2. To fix the failures to demangle some C++ and Rust symbols, and the inconsistency of demangling tools across languages, the original LLVM demangler was replaced with symbolic-demangle (see reference [2]), a Rust implementation that supports all the languages we need, which greatly reduces operations cost.
  3. The new scheme is tried first; if a request is not covered by the gradual rollout, or the new scheme fails to resolve, we fall back to the old scheme.

3.2 Implementation details

3.2.1 Symbol table file format

DWARF

File structure

DWARF is a debugging information format. It is usually used for source-level debugging, and it can also be used to recover the source symbols and line numbers corresponding to a runtime address (as atos does).

If Build Options -> Debug Information Format is set to DWARF with dSYM File, Xcode generates a dSYM file during packaging. It contains the DWARF data needed to map an address to the method symbol, file name, and line number, which is what lets developers troubleshoot problems after a version has shipped.
Take the DWARF file inside AwemeDylib.framework.dSYM as an example. Running the macOS file command shows its file type:

[figure]

As the figure shows, a DWARF file is itself a kind of Mach-O file, so it can also be opened and analyzed with MachOView.

[figure]

The figure shows that the Mach-O file type is MH_DSYM. Since it is a Mach-O file, the size command can list the segments and sections of the AwemeDylib DWARF file; taking the arm64 architecture as an example:
~/Downloads/dwarf/AwemeDylib.framework.dSYM/Contents/Resources/DWARF > size -x -m -l AwemeDylib
AwemeDylib (for architecture arm64):
Segment __TEXT: 0x18a4000 (vmaddr 0x0 fileoff 0)
        Section __text: 0x130fd54 (addr 0x5640 offset 0)
        Section __stubs: 0x89d0 (addr 0x1315394 offset 0)
        Section __stub_helper: 0x41c4 (addr 0x131dd64 offset 0)
        Section __const: 0x1a4358 (addr 0x1321f40 offset 0)
        Section __objc_methname: 0x47c15 (addr 0x14c6298 offset 0)
        Section __objc_classname: 0x45cd (addr 0x150dead offset 0)
        Section __objc_methtype: 0x3a0e6 (addr 0x151247a offset 0)
        Section __cstring: 0x1bf8e4 (addr 0x154c560 offset 0)
        Section __gcc_except_tab: 0x1004b8 (addr 0x170be44 offset 0)
        Section __ustring: 0x1d46 (addr 0x180c2fc offset 0)
        Section __unwind_info: 0x67c40 (addr 0x180e044 offset 0)
        Section __eh_frame: 0x2e368 (addr 0x1875c88 offset 0)
        total 0x189e992
Segment __DATA: 0x5f8000 (vmaddr 0x18a4000 fileoff 0)
        Section __got: 0x4238 (addr 0x18a4000 offset 0)
        Section __la_symbol_ptr: 0x5be0 (addr 0x18a8238 offset 0)
        Section __mod_init_func: 0x1850 (addr 0x18ade18 offset 0)
        Section __const: 0x146cb0 (addr 0x18af670 offset 0)
        Section __cfstring: 0x1b2c0 (addr 0x19f6320 offset 0)
        Section __objc_classlist: 0x1680 (addr 0x1a115e0 offset 0)
        Section __objc_nlclslist: 0x28 (addr 0x1a12c60 offset 0)
        Section __objc_catlist: 0x208 (addr 0x1a12c88 offset 0)
        Section __objc_protolist: 0x2f0 (addr 0x1a12e90 offset 0)
        Section __objc_imageinfo: 0x8 (addr 0x1a13180 offset 0)
        Section __objc_const: 0xb2dc8 (addr 0x1a13188 offset 0)
        Section __objc_selrefs: 0xf000 (addr 0x1ac5f50 offset 0)
        Section __objc_protorefs: 0x48 (addr 0x1ad4f50 offset 0)
        Section __objc_classrefs: 0x16a8 (addr 0x1ad4f98 offset 0)
        Section __objc_superrefs: 0x1098 (addr 0x1ad6640 offset 0)
        Section __objc_ivar: 0x42c4 (addr 0x1ad76d8 offset 0)
        Section __objc_data: 0xe100 (addr 0x1adb9a0 offset 0)
        Section __data: 0xc0d20 (addr 0x1ae9aa0 offset 0)
        Section HMDModule: 0x50 (addr 0x1baa7c0 offset 0)
        Section __bss: 0x1e9038 (addr 0x1baa820 offset 0)
        Section __common: 0x1058e0 (addr 0x1d93860 offset 0)
        total 0x5f511c
Segment __LINKEDIT: 0x609000 (vmaddr 0x1e9c000 fileoff 4096)
Segment __DWARF: 0x2a51000 (vmaddr 0x24a5000 fileoff 6332416)
        Section __debug_line: 0x3e96b7 (addr 0x24a5000 offset 6332416)
        Section __debug_pubnames: 0x16ca3a (addr 0x288e6b7 offset 10434231)
        Section __debug_pubtypes: 0x2e111a (addr 0x29fb0f1 offset 11927793)
        Section __debug_aranges: 0xf010 (addr 0x2cdc20b offset 14946827)
        Section __debug_info: 0x12792a4 (addr 0x2ceb21b offset 15008283)
        Section __debug_ranges: 0x567b0 (addr 0x3f644bf offset 34378943)
        Section __debug_loc: 0x674483 (addr 0x3fbac6f offset 34733167)
        Section __debug_abbrev: 0x2637 (addr 0x462f0f2 offset 41500914)
        Section __debug_str: 0x5d0e9e (addr 0x4631729 offset 41510697)
        Section __apple_names: 0x1a6984 (addr 0x4c025c7 offset 47609287)
        Section __apple_namespac: 0x1b90 (addr 0x4da8f4b offset 49340235)
        Section __apple_types: 0x137666 (addr 0x4daaadb offset 49347291)
        Section __apple_objc: 0x13680 (addr 0x4ee2141 offset 50622785)
        total 0x2a507c1
total 0x4ef6000
There is a segment named __DWARF, containing sections such as __debug_line, __debug_aranges, and __debug_info. We can use dwarfdump to explore what the __DWARF segment holds; for example, dwarfdump AwemeDylib --debug-info prints the formatted content of the __debug_info section. For the complete usage of dwarfdump, see the LLVM toolchain documentation (reference [3]).
According to the official DWARF format specification (reference [4]), the relationships between these sections are shown in the figure below:

[figure]
debug_info
The debug_info section holds the core information of a DWARF file. DWARF describes this information uniformly with Debugging Information Entries (DIEs). Each DIE contains:
  • One TAG describing what kind of element it is, e.g. DW_TAG_subprogram (function), DW_TAG_formal_parameter (formal parameter), DW_TAG_variable (variable), DW_TAG_base_type (base type).
  • N attributes that further describe the DIE.
Here is an example:
0x0049622c:   DW_TAG_subprogram
                DW_AT_low_pc        (0x000000000030057c)
                DW_AT_high_pc        (0x0000000000300690)
                DW_AT_frame_base        (DW_OP_reg29 W29)
                DW_AT_object_pointer        (0x0049629e)
                DW_AT_name        ("+[SSZipArchive _dateWithMSDOSFormat:]")
                DW_AT_decl_file        ("/var/folders/03/2g9r4cnj3kqb5605581m1nf40000gn/T/cocoapods-uclardjg/Pods/SSZipArchive/SSZipArchive/SSZipArchive.m")
                DW_AT_decl_line        (965)
                DW_AT_prototyped        (0x01)
                DW_AT_type        (0x00498104 "NSDate*")
                DW_AT_APPLE_optimized        (0x01)
The key attributes are interpreted as follows:
  • DW_AT_low_pc and DW_AT_high_pc are the start and end PC addresses of the function.
  • DW_AT_name gives the function name, +[SSZipArchive _dateWithMSDOSFormat:].
  • DW_AT_decl_file says the function is declared in the file .../SSZipArchive.m.
  • DW_AT_decl_line says the function is declared at line 965 of .../SSZipArchive.m.
  • DW_AT_type describes the function's return type; for this function it is NSDate*.
Two things are worth noting:
  1. DWARF defines only a limited set of attributes; the full list can be found in the LLVM API documentation (reference [5]).

  2. The machine-code addresses described by DW_AT_low_pc and DW_AT_high_pc are not the addresses the program uses at runtime; we call them file_address. For security the operating system applies Address Space Layout Randomization (ASLR): when it loads the executable into memory it adds a random offset (referred to below as load_address). Once we have that offset we also need to add the vmaddr of the __TEXT segment to recover the runtime address. vmaddr can be obtained with the size command or otool -l. Note that vmaddr generally depends on the architecture: it is usually 0x4000 for armv7 and 0x100000000 for arm64, but not always; for the AwemeDylib dynamic library symbol table used here, the arm64 vmaddr is 0. We call the address of a function at App runtime the runtime_address.

The relationship between these addresses is:
file_address = runtime_address - load_address + vm_address
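As a tiny illustration, the conversion can be written as a helper (a sketch; the function and parameter names are ours):

package symbolication

// fileAddress recovers the address recorded in the DWARF file from an
// address observed at runtime:
//   file_address = runtime_address - load_address + vmaddr(__TEXT)
func fileAddress(runtimeAddr, loadAddr, textVMAddr uint64) uint64 {
    return runtimeAddr - loadAddr + textVMAddr
}

// For the crash frame discussed later:
//   fileAddress(0x10035d580, 0x10005d000, 0x0) == 0x300580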
CompileUnit
A CompileUnit (compile unit) usually corresponds to a DIE whose TAG is DW_TAG_compile_unit. A compile unit represents the contribution of one compiled source file to the executable's __TEXT, __DATA, and other segments; roughly, it corresponds to one file that takes part in compilation, such as a .m, .mm, .cpp, or .c source file. A compile unit contains all the DIEs declared in it (methods, parameters, variables, and so on). A typical example:
0x00495ea3: DW_TAG_compile_unit
              DW_AT_producer        ("Apple LLVM version 10.0.0 (clang-1000.11.45.5)")
              DW_AT_language        (DW_LANG_ObjC)
              DW_AT_name        ("/var/folders/03/2g9r4cnj3kqb5605581m1nf40000gn/T/cocoapods-uclardjg/Pods/SSZipArchive/SSZipArchive/SSZipArchive.m")
              DW_AT_stmt_list        (0x001e8f31)
              DW_AT_comp_dir        ("/private/var/folders/03/2g9r4cnj3kqb5605581m1nf40000gn/T/cocoapods-uclardjg/Pods")
              DW_AT_APPLE_optimized        (0x01)
              DW_AT_APPLE_major_runtime_vers        (0x02)
              DW_AT_low_pc        (0x00000000002fc8e8)
              DW_AT_high_pc        (0x0000000000300828)
The key attributes are interpreted as follows:
  • DW_AT_language describes the programming language of this compile unit.
  • DW_AT_stmt_list is the offset of this compile unit's line-number information within the debug_line section; the next subsection covers it in detail.
  • DW_AT_low_pc and DW_AT_high_pc here are the start and end PC addresses covering all the DW_TAG_subprogram DIEs contained in the compile unit.
debug_line
Running dwarfdump AwemeDylib --debug-line prints the structured data of the debug_line section.
Searching it for the DW_AT_stmt_list value from the previous subsection, 0x001e8f31, we find:
debug_line[0x001e8f31]
...
include_directories[  1] = "/var/folders/03/2g9r4cnj3kqb5605581m1nf40000gn/T/cocoapods-uclardjg/Pods/SSZipArchive/SSZipArchive"
...
file_names[  1]:
           name: "SSZipArchive.m"
      dir_index: 1
       mod_time: 0x00000000
         length: 0x00000000
...
Address                                     Line  Column File   ISA   Discriminator     Flags
------------------------ ------   ------    --- -----  -------------  --------
0x00000000002fc8e8        46           0       1         0                         0    is_stmt
0x00000000002fc908        48          32       1         0                         0    is_stmt prologue_end
0x00000000002fc920         0          32       1         0                         0 
0x00000000002fc928        48          19       1         0                         0 
0x00000000002fc934        49           9       1         0                         0    is_stmt
0x00000000002fc938        53          15       1         0                         0    is_stmt
0x00000000002fc940        54           9       1         0                         0    is_stmt
...
0x0000000000300828  1058               1       1         0                         0    is_stmt end_sequence
Combining include_directories and file_names gives the absolute path of the compiled source file.
The table that follows maps each file_address to a file name and line number:
  • Address: the FileAddress.

  • Line: the line number in the source file corresponding to this FileAddress.

  • Column: the column number in the source file corresponding to this FileAddress.

  • File: the index of the source file, matching the subscripts of file_names above.

  • ISA: an unsigned integer indicating which instruction-set architecture the instruction applies to; usually 0.

  • Discriminator: an unsigned integer distinguishing blocks that share the same source position; in the common single-block case it is 0.

  • Flags: marker bits; the two most important are:

    • end_sequence: the address one past the last machine instruction of the sequence, so within the current compile unit only addresses before the end_sequence address are valid instructions.
    • is_stmt: whether the instruction is a recommended breakpoint location. Rows where is_stmt is false usually correspond to compiler-optimized instructions whose line number is often 0, which interferes with analysis; how to correct for this is discussed below.
Principle of symbol parsing

Take this frame from a call stack as an example:

5 AwemeDylib 0x000000010035d580 0x10005d000 + 3147136

The corresponding binary image is:
0x10005d000 - 0x1000dffff AwemeDylib arm64
Using the formula from the file-structure section, the file_address corresponding to the crash address is:
file_address = 0x000000010035d580 - 0x10005d000 + 0x0 = 0x300580
Then dwarfdump's --lookup option finds the corresponding method name and line number:

[figure]

The flowchart below describes how dwarfdump (and tools such as atos) map an address to a symbol:

[figure]

The dwarfdump output is fully consistent with our manual analysis; in the figure below, the whole address range 0x30057c~0x300593 resolves to exactly the same file name and line number.

[figure]
For DWARF-based symbol parsing we expect the result in the format:
func_name (in binary_name) (file_name:line_number)
Taking FileAddress 0x300580 as an example, our manual analysis gives:
+[SSZipArchive _dateWithMSDOSFormat:] (in AwemeDylib) (SSZipArchive.m:965)
Running atos on the same address:
atos -o AwemeDylib.framework.dSYM/Contents/Resources/DWARF/AwemeDylib -arch arm64 -l 0x10005d000 0x000000010035d580
+[SSZipArchive _dateWithMSDOSFormat:] (in AwemeDylib) (SSZipArchive.m:965)
So atos also agrees exactly with our manual analysis.

Symbol Table

The previous section described symbol parsing based on the DWARF file. However, that approach cannot cover 100% of the scenarios, because:
  1. If a statically linked framework sets the build setting GCC_GENERATE_DEBUGGING_SYMBOLS to NO, the dSYM generated when the final App is packaged will contain no file-name or line-number information for the machine instructions produced from that code.
  2. System libraries do not ship dSYM files at all; all we have are Mach-O files in .dylib or .framework form, such as libobjc.A.dylib or Foundation.framework.
For symbols without a DWARF file, we need another mechanism: the Symbol Table and String Table.
File structure
The Symbol Table of a Mach-O file looks like this in MachOView:

[figure]

Key fields:
  • String Table Index: the offset into the String Table. Adding this offset to the start of the String Table locates the string for the symbol. For example, the first symbol info circled above has offset 0x0048C12B; adding the String Table start 0x02BBC360 gives 0x0304848B, which resolves to _ff_stream_add_bitstream_filter.

[figure]

  • Value: the starting FileAddress of the method.
Principle of symbol parsing
  1. Sort the Symbol Table entries by Value.

  2. In the sorted list, find the index of the first Value that is greater than the crash FileAddress; the entry at index-1 is the one containing the crash address. Its String Table Index then leads to the method name in the String Table, and FileAddress minus that entry's Value is the byte offset from the method's start address to the crash address. A minimal sketch of this lookup follows.
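A minimal Go sketch of this lookup, assuming the Mach-O file has been opened with the standard debug/macho package (the function name, error handling, and the choice to skip filtering special entries are ours):

package symbolication

import (
    "debug/macho"
    "fmt"
    "sort"
)

// lookupSymbol resolves a FileAddress against the Mach-O symbol table:
// sort the entries by Value, take the last entry whose Value <= fileAddr,
// and report "name + offset". Real code would first filter out undefined
// and zero-value entries.
func lookupSymbol(f *macho.File, fileAddr uint64) (string, error) {
    if f.Symtab == nil {
        return "", fmt.Errorf("no symbol table")
    }
    syms := make([]macho.Symbol, len(f.Symtab.Syms))
    copy(syms, f.Symtab.Syms)
    sort.Slice(syms, func(i, j int) bool { return syms[i].Value < syms[j].Value })

    // Index of the first entry whose Value is strictly greater than fileAddr...
    idx := sort.Search(len(syms), func(i int) bool { return syms[i].Value > fileAddr })
    if idx == 0 {
        return "", fmt.Errorf("address 0x%x is below the first symbol", fileAddr)
    }
    s := syms[idx-1] // ...so the entry just before it owns the address.
    return fmt.Sprintf("%s + %d", s.Name, fileAddr-s.Value), nil
}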

For Symbol Table-based parsing we expect the result in the format:
func_name (in binary_name) + func_offset
Taking FileAddress 0x56C1DE as an example, our manual analysis gives:
_ff_stream_add_bitstream_filter (in AwemeDylib) + 2
Running atos on the same address:
atos -o AwemeDylib.framework.dSYM/Contents/Resources/DWARF/AwemeDylib -arch arm64 -l 0x0 0x56C1DE
ff_stream_add_bitstream_filter (in AwemeDylib) + 2
So atos can again be considered fully consistent with our manual analysis; the only difference is that atos strips the leading underscore that the compiler adds to C function names by default.

3.2.2 Implementation of the online pre-parsing scheme

Go native implementation

Go's standard library debug/dwarf parses DWARF files and can conveniently print the file name and line number for an address, and Go is naturally cross-platform.
However, the native implementation does not actually meet our needs, mainly because:
  1. debug/dwarf has no API for resolving method names directly, so the parsing result is incomplete.

  2. It does not handle more complex scenarios such as the file names and line numbers of inlined functions.

  3. It assumes the FileAddress is already known; it provides nothing for full pre-parsing.

  4. It only parses DWARF data, not the Symbol Table.

So we still had to implement DWARF and Symbol Table parsing ourselves.

Full pre-parsing implementation

Following the principles above, the first idea that comes to mind is: just resolve every possible address in the __text section of the __TEXT segment one by one, then store the results in a backend distributed cache such as HBase or Redis. Wouldn't that do?
It would, but it is unnecessary.

[figure]

As the figure shows, the code section size is 0x130FD54, which is nearly 20 million in decimal, and that is a single architecture of a single symbol table file, while the byte stability monitoring platform has hundreds of thousands of symbol tables online. Storing at that granularity would consume far too many machine resources and is clearly unrealistic. From the parsing principle it is easy to see that a contiguous range of addresses can resolve to exactly the same result. For example, as mentioned above, for the arm64 architecture of this AwemeDylib dSYM every address from 0x30057c to 0x300593 resolves to +[SSZipArchive _dateWithMSDOSFormat:] (in AwemeDylib) (SSZipArchive.m:965). That alone gives at least a 20x compression ratio, and the strategy applies to both DWARF and Symbol Table data.
So the next question: we know that for the arm64 architecture of the AwemeDylib dSYM, the address range 0x30057c~0x300593 resolves to +[SSZipArchive _dateWithMSDOSFormat:] (in AwemeDylib) (SSZipArchive.m:965). The HBase value is easy: we wrap the lowest address of the range, the highest address, the resolved method name, file name, line number, and so on into a struct and use it as the value; we call it unit{}. But what should the key be?
There is a thorny problem here: at pre-parsing time we store address ranges, but at online parsing time the input is a single address, so how do we derive the HBase key that stores it from that address alone? Our solution is:
hbase_key = [table_name]+image_name+uuid+chunk_index
Each part means:
  • table_name: distinguishes the two types, dwarf and symbol_table.

  • image_name: the binary name, e.g. Aweme, libobjc.A.dylib.

  • uuid: the unique identifier of a symbol table file. Note that a dSYM is usually a fat binary containing multiple architectures, and the Mach-O files for different architectures have different uuids.

  • chunk_index: the address space is divided into chunks of a constant length N (here we use 10000 as an example), and chunk_index is the chunk an address falls into, i.e. the address divided by N, rounded down. This is straightforward for a single address; for an address range it is slightly more involved. If the lower and upper bounds of a range fall into the same chunk after dividing by N, the range lives in one chunk; if not, then to guarantee that every address inside the range can be resolved at read time, the range must be written into every chunk_index it spans.

With this policy, the HBase value cannot be a single address range plus its parsing result; it has to be the array of all address ranges that fall into that chunk, written as []unit{}. The schematic is as follows:

[figure]

As the figure shows, because the address range 29001~41000 spans 3 chunk_index values, it is written into all three HBase cache entries. There is a little redundancy, but it maximizes performance and throughput. When resolving a call-stack address online, we simply divide the offset address by N and round down to find which chunk_index it falls into, then binary-search for the first unit whose start is just greater than the address and step back one unit to get the result we need. A sketch of this chunking and lookup follows.
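A sketch of the chunking strategy and lookup described above (the unit struct, the key layout, and N = 10000 follow the text; the in-memory map is a stand-in for the real HBase client, and units are assumed to be appended in ascending order of their start offset):

package symbolication

import (
    "fmt"
    "sort"
)

const chunkSize = 10000 // the constant N in the text

// unit is one pre-parsed address range and its resolution result.
type unit struct {
    Low, High uint64 // offset-address range covered by this result
    Symbol    string
    File      string
    Line      int
}

// chunkKey builds the cache key [table_name]+image_name+uuid+chunk_index.
func chunkKey(table, image, uuid string, chunk uint64) string {
    return fmt.Sprintf("%s+%s+%s+%d", table, image, uuid, chunk)
}

// writeUnit appends u to every chunk its range spans, so any address inside
// the range can later be served from a single chunk read.
func writeUnit(cache map[string][]unit, table, image, uuid string, u unit) {
    for c := u.Low / chunkSize; c <= u.High/chunkSize; c++ {
        key := chunkKey(table, image, uuid, c)
        cache[key] = append(cache[key], u)
    }
}

// lookup resolves one offset address: fetch the chunk it falls into, then
// binary-search for the first unit starting beyond the address and step back.
func lookup(cache map[string][]unit, table, image, uuid string, offset uint64) (unit, bool) {
    units := cache[chunkKey(table, image, uuid, offset/chunkSize)]
    i := sort.Search(len(units), func(i int) bool { return units[i].Low > offset })
    if i == 0 || offset > units[i-1].High {
        return unit{}, false
    }
    return units[i-1], true
}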
Note: the online path first queries the HBase cache of the dwarf table and splices the method name, file name, and line number into the required format; if nothing is found, it queries the HBase cache of the symbol_table table and computes the offset from the function's start address. To avoid keeping cold data around long after a symbol table is uploaded, every chunk above is given a 45-day expiry; whenever a chunk is queried online, its expiry is refreshed to 45 days from the current time.

DWARF file parsing

Full CompileUnit parsing
From the section on the principle of DWARF-based symbol parsing we know that resolving the file name, line number, and function name all depends on the CompileUnit, and from the DWARF specification we know that the offsets of all CompileUnits within the debug_info section are recorded in the debug_aranges section.

[figure]

The specification also gives the binary layout of debug_aranges. Based on that layout we need to extract every debug_info_offset manually; for brevity the full implementation is not included here. One thing that needs special attention when parsing the binary by hand is endianness. A minimal sketch of extracting the offsets follows.
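A minimal sketch of that manual parse, assuming 32-bit DWARF and little-endian data as found in Apple arm64 dSYMs (the 64-bit DWARF initial-length escape and error handling are omitted):

package symbolication

import (
    "bytes"
    "encoding/binary"
    "io"
)

// compileUnitOffsets walks the raw __debug_aranges section and collects the
// debug_info offset of every compile unit it references. Each set starts with
// a header (unit_length, version, debug_info_offset, address_size, segment_size)
// followed by padding and (address, length) tuples, which we simply skip over.
func compileUnitOffsets(aranges []byte) []uint32 {
    var offsets []uint32
    r := bytes.NewReader(aranges)
    for r.Len() > 0 {
        var unitLength uint32
        if err := binary.Read(r, binary.LittleEndian, &unitLength); err != nil {
            break
        }
        setStart := r.Size() - int64(r.Len()) // position right after unit_length

        var version uint16
        var infoOffset uint32
        var addrSize, segSize uint8
        binary.Read(r, binary.LittleEndian, &version)
        binary.Read(r, binary.LittleEndian, &infoOffset)
        binary.Read(r, binary.LittleEndian, &addrSize)
        binary.Read(r, binary.LittleEndian, &segSize)
        offsets = append(offsets, infoOffset)

        // unit_length counts the bytes after the length field itself,
        // so the next set begins at setStart + unit_length.
        r.Seek(setStart+int64(unitLength), io.SeekStart)
    }
    return offsets
}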
Full address resolution flow
The figure below shows the full address-resolution flow. Three points deserve special attention:
  1. For an inlined function, the function name is still taken from the function's declaration, but the file name and line number are taken from the inlining site; this matches atos's results. Otherwise two adjacent frames of the call stack might appear to jump around, hurting analysis efficiency.
  2. From the DWARF specification we know that if a debug_line row has is_stmt in its Flags column, the corresponding instruction is a compiler-recommended breakpoint location; otherwise it is not. Since a breakpoint can only be set on a whole line, the instructions from one is_stmt row up to the next is_stmt row all share the same source file name and line number. So for a row without is_stmt, we only need to find the nearest preceding row (with a smaller address) that carries is_stmt to obtain the correct file name and line number. The conclusion: is_stmt is the marker that decides whether consecutive debug_line rows can be merged, and only the rows between two consecutive is_stmt == true rows can be merged (see the sketch after the figure below).
  3. The address ranges written to HBase are offset addresses, computed as offset = file_address - __TEXT.vmaddr, so that we do not need to care where the __TEXT segment of the corresponding DWARF file starts.
[figure]
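A sketch of the merge rule from point 2, built on the standard debug/dwarf line reader (the range type and function name are ours; inline-function handling is omitted):

package symbolication

import (
    "debug/dwarf"
    "io"
)

// lineRange is one contiguous file-address range that maps to a single file:line.
type lineRange struct {
    Low, High uint64
    File      string
    Line      int
}

// mergeLineEntries walks one compile unit's line program and merges every row
// that lacks is_stmt into the closest preceding is_stmt row, so consecutive
// rows with the same resolution collapse into one range.
func mergeLineEntries(lr *dwarf.LineReader) ([]lineRange, error) {
    var ranges []lineRange
    open := false // is there a range still waiting for its High bound?

    closeLast := func(end uint64) {
        if open {
            ranges[len(ranges)-1].High = end
            open = false
        }
    }

    var e dwarf.LineEntry
    for {
        if err := lr.Next(&e); err == io.EOF {
            break
        } else if err != nil {
            return nil, err
        }
        if e.EndSequence {
            // Addresses at or beyond end_sequence are not valid instructions.
            closeLast(e.Address - 1)
            continue
        }
        if e.IsStmt {
            closeLast(e.Address - 1)
            ranges = append(ranges, lineRange{Low: e.Address, File: e.File.Name, Line: e.Line})
            open = true
        }
        // Rows without is_stmt simply inherit the preceding is_stmt row.
    }
    return ranges, nil
}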

Symbol Table parsing

Symbol Table parsing is comparatively simple: sort the Symbol Table entries by Value, then write each entry's start and end address together with its function name into HBase following the strategy in the previous section.

3.2.3 Pitfalls

We hit a number of pitfalls while implementing this scheme; here are some typical ones for reference:
  1. Writing took much longer than expected.

    Cause: the demangle tool was invoked before every HBase write, adding tens of milliseconds each time; at this scale the overhead is hugely amplified.

    Solution: move demangling from before the HBase write to after the HBase query; after all, far fewer methods show up in crashes than exist in the full symbol table.

  2. CompileUnit lookup failure.

    Cause: in most cases, the compile unit offset read from the .debug_aranges section needs an extra 0xB added to it to reach the CompileUnit offset we expect.
     [figure]

    But in this case something unexpected happened:
     [figure]
    Here the extra offset is not 0xB; the compile unit offset read from debug_aranges is already correct as-is, for reasons that are still unclear.
    Solution: add a compatibility path: if fetching the compile unit with the offset plus 0xB fails, retry with the 0xB subtracted again.

  3. Two debug_line rows at exactly the same address make the parsing result ambiguous.

    Cause: two consecutive rows share the same address but carry different file names and line numbers, so the result is ambiguous.

     [figure]

    Solution: follow atos's behaviour and take the earlier row as authoritative.
  4. The end_sequence row has been read as the last row of debug_line, but the current CompileUnit still has DIEs with TAG DW_TAG_subprogram that are not covered by any address in debug_line, so their address ranges would be missed.

    Cause: suspected compiler optimization; the method names of these DIEs generally start with _OUTLINED_FUNCTION_.

    Solution: if the end_sequence row has been parsed and the current CompileUnit still has unindexed DW_TAG_subprogram DIEs, give those DIEs' address ranges the file name and line number of the end_sequence row.

  5. Illegal entries in the Symbol Table.

    Cause: the FileAddress of such an entry is smaller than __TEXT.vmaddr, so the offset is negative; and because we had originally defined the address offset as uint64, the offset was coerced into an enormous integer, which is not what we expected.

     [figure]

    Solution: filter out entries whose address offset would be negative. A minimal sketch follows.
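A minimal sketch of that filter, comparing before any unsigned subtraction (names are ours):

package symbolication

// validOffset returns a symbol-table entry's offset relative to __TEXT.vmaddr
// and reports whether it is legal. Subtracting first with uint64 arithmetic
// would wrap a negative result around to a huge bogus value, which is exactly
// the pitfall described above.
func validOffset(fileAddr, textVMAddr uint64) (uint64, bool) {
    if fileAddr < textVMAddr {
        return 0, false // illegal entry: FileAddress below __TEXT.vmaddr
    }
    return fileAddr - textVMAddr, true
}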

4. Online results

Before the full rollout, this solution was A/B tested for about two weeks, during which we fixed all known bad cases where its results diffed from the old scheme. The performance metrics after the full rollout are as follows:

4.1 Single line parsing time

[figure]

As of 10:46 on 7.7, over the most recent 6 hours the average parsing time improved by about 70x and pct99 by more than 300x.

4.2 Overall latency of the crash-parsing interface

From 7.7 to 7.10, the overall average latency of the crash-parsing interface dropped by 50%+.
[figure]
From 7.7 to 7.10, the overall pct99 latency of the crash-parsing interface dropped by 70%+.
[figure]

4.3 Symbol table file access volume

From 7.7 to 7.10, the daily volume of symbol table file accesses dropped by 50%+.
[figure]

4.4 Parsing errors

Since the rollout ramp-up started on 7.7, parsing errors have disappeared completely.
[figure]

4.5 Physical machine performance

Monitoring a representative online physical machine shows very clear improvements in machine load, memory footprint, CPU usage, and network I/O compared with before.

Below, several core metrics from the dashboard are compared before and after the optimization:

  • Time range before optimization: 7.3 12:00 - 7.5 12:00

  • Time range after optimization: 7.10 12:00 - 7.12 12:00

15min load

[figure]
[figure]
15-minute load average: 5.76 => 0.84, which can be read as the cluster's overall parsing efficiency improving to about 6.85x the original.

IOWait CPU usage

[figure]
[figure]
Average IOWait CPU usage: 4.21 => 0.16, a 96% improvement.

Memory footprint

[figure]
[figure]
Average memory usage: 74.4 GiB => 31.7 GiB, a 57% improvement.

Network input traffic

[figure]
[figure]
Network input traffic: 13.2 MB/s => 4.34 MB/s, a 67% improvement.

References

[1] How to make Go calling C 10x faster? https://my.oschina.net/linker/blog/1529928

[2] symbolic-demangle: https://docs.rs/crate/symbolic-demangle/8.3.0

[3] llvm-dwarfdump: https://llvm.org/docs/CommandGuide/llvm-dwarfdump.html

[4] DWARF 4 specification: http://www.dwarfstd.org/doc/DWARF4.pdf

[5] LLVM llvm::dwarf API documentation: http://formalverification.cs.utah.edu/llvm_doxy/2.9/namespacellvm_1_1dwarf.html#a85bda042c02722848a3411b67924eb47

About the ByteDance Terminal Technology team

The ByteDance Terminal Technology team (Client Infrastructure) is a global R&D team for big-front-end foundational technology (with teams in Beijing, Shanghai, Hangzhou, Shenzhen, Guangzhou, Singapore, and Mountain View). It is responsible for building the company-wide front-end infrastructure and improving the performance, stability, and engineering efficiency of the entire product line; the products it supports include, but are not limited to, Douyin, Toutiao, Xigua Video, Feishu, and Guagualong, with in-depth work on mobile, Web, desktop, and other terminals.
Now is the time! We are hiring globally for client, front-end, server, terminal intelligence algorithm, and test development roles. Come change the world with technology. If interested, contact chenxuwei.cxw@bytedance.com with the subject line: Resume - Name - Position - Preferred City - Phone.
