iOS Codeless Tracking Data SDK: The Road of Practice

This article summarizes the codeless tracking data SDK built at NetEase Lede. The SDK, which is responsible for collecting tracking data, has been in development for more than half a year. It has been presented within the team several times, and now seems a good time to share it with peers. The article focuses on the SDK's overall implementation approach and its key technical points.

The SDK can already collect all event data in an app automatically, dynamically, completely, and correctly, without any tracking code. A separate companion SDK, the visual selection SDK, has also been developed: with it, interface elements can be selected and KVC configurations uploaded from within the app itself. Element selection can be handed to research and product staff, which reduces developers' workload.

The SDK's existing functionality falls into two parts:

  • Basic event data collection: basic events include application cold-start events, page events, user click events, and ScrollView scroll events. All of this is collected automatically; the approach is introduced in the first section.
  • Business-layer data collection: this refers to data tied to business functions. For example, when the user taps the submit-order button, the SDK collects the purchased items and the order total. In the past this kind of data was mostly collected with hand-written tracking code; this SDK obtains the desired business data with no tracking code at all. The implementation is detailed in the second section.

Overall implementation of the SDK

The SDK as a whole follows the idea of AOP (Aspect-Oriented Programming): data collection code is inserted dynamically before and after a method. In Objective-C this is implemented with Method Swizzling, the “black magic” built on the Runtime.

The SDK collects data mainly by using Method Swizzling to hook the relevant methods. The hooked methods fall into three categories: methods of system classes, Delegate methods of system classes, and methods of custom classes.

Methods of system classes

System classes are the basic classes provided by the system frameworks, such as UIApplication and UIViewController. To implement certain features, the SDK needs to hook methods of these classes. For example, page events are collected mainly by hooking the UIViewController life cycle methods viewDidLoad, viewDidAppear:, viewDidDisappear:, and dealloc.
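A minimal sketch of the swizzling step may help here. It is not the SDK's actual code: the helper GRSwizzleInstanceMethod and the toy class GRDemoPage are made-up names, and the toy class stands in for UIViewController so that the example needs only Foundation. In a real app the same helper would typically be applied to viewDidAppear: and similar methods once, early in the app's life.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical helper: exchanges the implementations of two instance
// methods on `cls`. Returns NO if either method cannot be found.
static BOOL GRSwizzleInstanceMethod(Class cls, SEL original, SEL replacement) {
    Method originalMethod = class_getInstanceMethod(cls, original);
    Method replacementMethod = class_getInstanceMethod(cls, replacement);
    if (originalMethod == NULL || replacementMethod == NULL) {
        return NO;
    }
    // If `cls` only inherits `original`, add an override first so that
    // the superclass implementation is left untouched.
    BOOL added = class_addMethod(cls, original,
                                 method_getImplementation(replacementMethod),
                                 method_getTypeEncoding(replacementMethod));
    if (added) {
        class_replaceMethod(cls, replacement,
                            method_getImplementation(originalMethod),
                            method_getTypeEncoding(originalMethod));
    } else {
        method_exchangeImplementations(originalMethod, replacementMethod);
    }
    return YES;
}

// Toy stand-in for a page class such as a UIViewController subclass.
@interface GRDemoPage : NSObject
@property (nonatomic, strong) NSMutableArray<NSString *> *log;
- (void)pageDidAppear;
- (void)gr_pageDidAppear;
@end

@implementation GRDemoPage
- (instancetype)init {
    if ((self = [super init])) {
        _log = [NSMutableArray array];
    }
    return self;
}
- (void)pageDidAppear {
    [self.log addObject:@"original"];
}
// After swizzling, the selector gr_pageDidAppear points at the original
// body, so the call below is not infinite recursion.
- (void)gr_pageDidAppear {
    [self.log addObject:@"collect:before"];
    [self gr_pageDidAppear];
    [self.log addObject:@"collect:after"];
}
@end
```

After GRSwizzleInstanceMethod([GRDemoPage class], @selector(pageDidAppear), @selector(gr_pageDidAppear)) has run, calling pageDidAppear executes the collection code around the original body.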

Delegate methods of system classes

These are mainly the methods of Delegate protocols provided by the UIKit framework, such as UIScrollViewDelegate, UITableViewDelegate, and UIWebViewDelegate. Most of the SDK's features are implemented by hooking protocol methods. For example, list item click events are collected mainly by hooking tableView:didSelectRowAtIndexPath: in UITableViewDelegate.

Methods of custom classes

As the name suggests, these are methods of classes that developers define in their own project, as opposed to methods of system classes. Some SDK features are implemented by hooking such methods. For example, to collect gesture events the SDK needs to hook the action method of the target specified on the gesture object, and the target is usually a custom class. In fact, hooking a system class's Delegate methods can also be seen as hooking custom class methods, because those Delegate methods are mostly implemented in custom classes.

The above describes, in outline, how AOP is used to add data collection code. In practice it is not that simple and involves many details: how to attribute navigation bar clicks and system pop-ups to the right page, how to distinguish UIControlEventValueChanged events, how to solve the performance problems caused by hooking gestures, and so on. That is not the focus of this article, so I will not go further here; a later article will cover some of these pitfalls.

Key technologies in the SDK

Generating and optimizing viewPath and viewId

To collect, count, and analyze data for a view on an app page, we must be able to uniquely identify and locate that view. This is an essential prerequisite for a data collection SDK. So how is a view uniquely identified in an app? The SDK uses viewPath and viewId.

1 viewPath composition

The view structure of an entire app can be seen as a tree (viewTree). The root node is the UIWindow, the branches are made up of UIViewControllers and UIViews, and the leaf nodes are UIViews.

So what information in the viewTree can describe the position of any view? The natural idea is to form a path from the depth (level) of every node between the target view and the root, where the depth of a node means its index within its parent node. This does identify the view uniquely, but it has a drawback: it is very hard to read. So the name of each node is added as well, taken from the class name of the view at that node.

Thus, the information of every node from a view up to the root makes up the path of that view (viewPath). In addition, because statistics and analysis of views are done per page, the SDK generates the viewPath only up to the UIViewController that the view belongs to, not all the way to the root UIWindow. This also shortens the viewPath to some extent.

2 Depth representation for UITableViewCell / UICollectionViewCell

In app development, the most commonly used and most important controls are UITableView and UICollectionView. These reusable views contain many Cells, and the number of Cells is not fixed, so how should the depth of each Cell be expressed? The answer is indexPath. Although a Cell object may be reused, each displayed Cell corresponds to a unique indexPath, so the indexPath value can be used as its depth.

3 viewPath representation and example

We already know that a viewPath is made up of the class name and depth of each node; the next question is how to write it down. Here is a concrete example from a project I picked at random:

The name of each node in the path is:

HYGHallSlideViewController-UIScrollView-HYGHallProductTableView-UITableViewWrapperView-HYGHallProductCell-UITableViewCellContentView-HYGHallProductView.

The depth of each node in the path is: 0-0-1-0-0:2-0-1

The two are then combined to form the viewPath used by the SDK:

ViewPath:HYGHallSlideViewController-UIScrollView-HYGHallProductTableView-UITableViewWrapperView-HYGHallProductCell-UITableViewCellContentView-HYGHallProductView & 0-0-1-0-0:2-0-1

This is simply the two parts joined with the & separator, which makes them easy to combine and split for the viewPath matching described later. There is also a similar XPath-style representation that can be found online:

HYGHallSlideViewController[0]/UIScrollView[0]/HYGHallProductTableView[1]/UITableViewWrapperView[0]/HYGHallProductCell[0:2]/UITableViewCellContentView[0]/HYGHallProductView[1]

Personally, though, I find the XPath style a little more complicated, and it is more troublesome to combine and split. In any case the form of the viewPath is secondary: use whichever representation you prefer, and there is no need to agonize over which is better.
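To make the “names & depths” form concrete, here is a small sketch of joining and splitting such a string. The function names are made up for illustration, and the sketch assumes the separator is a bare & with no surrounding spaces; depths are kept as strings because a Cell depth such as 0:2 is not a plain number.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: joins node class names and node depths into a viewPath.
static NSString *GRMakeViewPath(NSArray<NSString *> *names,
                                NSArray<NSString *> *depths) {
    return [NSString stringWithFormat:@"%@&%@",
            [names componentsJoinedByString:@"-"],
            [depths componentsJoinedByString:@"-"]];
}

// Hypothetical helper: splits a viewPath back into @[names, depths].
// Returns nil if the string does not contain exactly one "&".
static NSArray<NSArray<NSString *> *> *GRSplitViewPath(NSString *viewPath) {
    NSArray<NSString *> *halves = [viewPath componentsSeparatedByString:@"&"];
    if (halves.count != 2) {
        return nil;
    }
    return @[[halves[0] componentsSeparatedByString:@"-"],
             [halves[1] componentsSeparatedByString:@"-"]];
}
```

Keeping names and depths in two parallel halves is what makes the split a pair of simple string operations, which is the convenience the & form is meant to provide.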

4 Optimizations for viewPath

4.1 Optimizing the node depth calculation

As mentioned above, the depth of a node is the index of the current view among all sub views of its parent view. In real development, however, the viewTree sometimes changes in response to user actions. Here is an example:

  • Suppose a UIView has three sub views, added in the order label, button1, button2. With the calculation above, their depths are 0, 1, 2. When the user taps a button, label is removed from the parent view. The UIView now has only two sub views, button1 and button2, and their depths become 0, 1. As shown in the figure:
[Figure: depths of the sub views before and after label is removed]

As can be seen, removing just one view changes the depths of the other sub views. To minimize the effect that adding or removing a view has on the depths of existing views, the SDK adjusted the depth calculation: the depth of a node is the index of the current view among all views of the same type in its parent view.

Look at the example again. Initially the depths of label, button1, and button2 are 0, 0, 1. After label is removed, the depths of button1 and button2 are 0, 1. In this example, removing label did not affect the depths of button1 and button2, so the adjusted calculation makes the viewPath more resistant to interference, at least to some degree.

In addition, the adjusted depth depends on the type of each node, so the viewPath must now contain the name of each node; the names are no longer there just for readability.
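The adjusted calculation can be sketched on a toy tree. GRNode below is a made-up stand-in for UIView (just a class name and an ordered list of children), so the example needs only Foundation; it assumes ARC. With real views the same loop would walk superview.subviews and compare classes.

```objc
#import <Foundation/Foundation.h>

// Toy stand-in for UIView: a class name plus an ordered list of children.
@interface GRNode : NSObject
@property (nonatomic, copy) NSString *className;
@property (nonatomic, weak) GRNode *parent;
@property (nonatomic, strong) NSMutableArray<GRNode *> *children;
+ (instancetype)nodeWithClassName:(NSString *)className;
- (void)addChild:(GRNode *)child;
- (void)removeFromParent;
@end

@implementation GRNode
+ (instancetype)nodeWithClassName:(NSString *)className {
    GRNode *node = [GRNode new];
    node.className = className;
    node.children = [NSMutableArray array];
    return node;
}
- (void)addChild:(GRNode *)child {
    child.parent = self;
    [self.children addObject:child];
}
- (void)removeFromParent {
    GRNode *oldParent = self.parent;
    self.parent = nil;
    [oldParent.children removeObject:self];
}
@end

// Depth = index of `node` among the siblings that share its class name.
static NSUInteger GRSameTypeDepth(GRNode *node) {
    NSUInteger depth = 0;
    for (GRNode *sibling in node.parent.children) {
        if (sibling == node) {
            break;
        }
        if ([sibling.className isEqualToString:node.className]) {
            depth++;
        }
    }
    return depth;
}
```

In the label, button1, button2 example, the two buttons keep depths 0 and 1 whether or not the label is present, because the label is never counted when indexing buttons.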

4.2 viewPath optimization for Swift

As everyone knows, when the class name of a Swift class is obtained, the Module name is automatically added as a prefix: if the Swift file is in the main project, the project name is added; if it is in a component and the project has enabled the use_frameworks! option, the component name is added. In general, in a project that uses Swift (pure Swift, or mixed OC and Swift), the viewPath will contain the ModuleName of the Swift classes. Consider the following cases:

  • An OC file is rewritten in Swift
  • A Swift file is moved from the main project to a component library, or from a component library to the main project
  • The main project switches use_frameworks! on or off when referencing component libraries

In all three cases the class name changes because of the ModuleName, which changes the viewPath. In other words, restructuring the project files can directly affect the viewPath.

In real development, especially in older OC projects, OC files are often rewritten in Swift. The SDK therefore needs to keep the viewPath from changing in these situations.

The solution is simple: since the change comes from the ModuleName prefix of the class name, just remove the ModuleName prefix of every Swift class when generating the viewPath. This removes the effect on the viewPath, but careful readers may notice another hidden problem: if two different views or controllers in different component libraries have the same name (which Swift allows, because they are distinguished by Module), will their viewPaths become indistinguishable?

After careful consideration, this concern turns out to be mostly unnecessary: even if views or controllers in two Modules share a name, their view structures and depths will differ, so their viewPaths will not be exactly the same.
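Stripping the prefix can be as small as the sketch below. GRStripModulePrefix is a made-up name, and the sketch only handles the plain “ModuleName.ClassName” form; other runtime name forms (for example mangled names of private or nested Swift types) would need extra handling that is not shown here.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: drops a leading "ModuleName." from a class name
// such as the string returned by NSStringFromClass().
static NSString *GRStripModulePrefix(NSString *className) {
    NSRange dot = [className rangeOfString:@"." options:NSBackwardsSearch];
    if (dot.location == NSNotFound) {
        return className;  // no module prefix, e.g. an Objective-C class
    }
    return [className substringFromIndex:NSMaxRange(dot)];
}
```

With this in place, a class that moves between modules, or is rewritten from OC to Swift under the same name, contributes the same node name to the viewPath.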

4.3 Calculating the depth of a VC when child VCs are involved

As mentioned earlier, the viewPath only extends up to the VC nearest to the view, and the depth of that VC is the depth of the VC's view among the views of its parent view. In real iOS development, addChildViewController: is often used to compose a complex page from several VCs, and in that case the VC depth calculation can run into problems. Another simple example:

  • Suppose a containerVC contains four child VCs: VC1, VC2, VC3, VC4. Each child VC is added the first time it is displayed, and its view is added to a scrollView at the same time. The order in which the child VCs are first displayed then changes the depth of their views. If the order is VC1, VC2, VC3, VC4, the depths are VC1 (0), VC2 (1), VC3 (2), VC4 (3). If the order is VC3, VC1, VC4, VC2, the depths become VC1 (1), VC2 (3), VC3 (0), VC4 (2). The viewPath is therefore unreliable and its uniqueness cannot be guaranteed.

To solve this, the SDK changed the VC depth calculation: it no longer uses the depth of the VC's view but a fixed value of 0. Because the VC is already the root level of the viewPath, its depth information is no longer important.

This raises another problem, however. If VC1 and VC2 are different instances of the same class, their internal view structures are exactly the same, and with a fixed VC depth of 0 the viewPath cannot tell which child VC a view belongs to. To distinguish different instances of the same class, the SDK uses another scheme: page aliases.

5 viewId generation

The viewPath can already identify a view uniquely, so why is a viewId needed? The main reason is that the length of a viewPath is not fixed and it is usually fairly long, which makes it inconvenient as the unique identifier of a view on the backend. So the SDK generates a fixed-length value from the viewPath with an MD5 hash and uses it as the viewId.
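As a sketch of this step, the function below hashes a viewPath with CommonCrypto's CC_MD5 and returns 32 lowercase hex characters. The function name is made up, and the UTF-8 encoding and hex output format are assumptions; the article does not say how the SDK encodes the digest. Note that CC_MD5 is marked deprecated in recent Apple SDKs and produces a compiler warning, although it is still adequate for a non-security identifier like this.

```objc
#import <Foundation/Foundation.h>
#import <CommonCrypto/CommonDigest.h>

// Hypothetical helper: derives a fixed-length viewId from a viewPath.
static NSString *GRViewIdFromViewPath(NSString *viewPath) {
    NSData *data = [viewPath dataUsingEncoding:NSUTF8StringEncoding];
    unsigned char digest[CC_MD5_DIGEST_LENGTH];
    CC_MD5(data.bytes, (CC_LONG)data.length, digest);

    // Render the 16 digest bytes as 32 lowercase hex characters.
    NSMutableString *hex =
        [NSMutableString stringWithCapacity:CC_MD5_DIGEST_LENGTH * 2];
    for (int i = 0; i < CC_MD5_DIGEST_LENGTH; i++) {
        [hex appendFormat:@"%02x", digest[i]];
    }
    return [hex copy];
}
```

Whatever the length of the input viewPath, the result always has the same length, which is what makes it convenient as a backend key.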

6 Handling duplicate viewPath and viewId

After these optimizations the SDK keeps the viewPath as stable as it can. That does not mean the viewPath alone can distinguish every click event. Sometimes views with the same viewPath take different forms or serve different functions, for example:

  • The same button shows different text in different states. For example, a button shows “add” before an item has been added and switches to “clear” as soon as the item is added
  • The same view has several click events, as with UISegmentedControl, UISwitch, UIStepper, etc.

In both cases one viewPath corresponds to several events, and the viewPath alone cannot tell the different states or events apart.

The SDK's solution to this kind of problem is “viewPath + other information”. The “other information” depends on the situation. In case 1 it is the title of the button. In case 2 it is the selectedSegmentIndex of the UISegmentedControl or the isOn property of the UISwitch. When collecting data, the SDK uploads this information along with the view; combined with the visual selection SDK, it allows these events to be told apart at statistics time.

One more note on “other information”: besides the information the SDK knows in advance how to obtain, there is also business data. For example, on a product list page where each row shows one product, if the backend should count clicks per product rather than clicks per row, the “other information” should be the productId. How the SDK collects and reports business-layer data is described below.

Implementing codeless business data collection in the SDK

With viewPath covered, this section describes the SDK's other key technology in detail: codeless business data collection based on viewPath and KVC. First, a brief look at the shortcomings of traditional tracking code, which are roughly as follows:

  • Tracking code is mixed with business logic code, which raises maintenance costs;
  • Tracking code can only ship with an app release, which delays data collection and statistics;
  • Tracking points are easily missed or placed incorrectly;

To address these shortcomings, the SDK implements business data collection that is codeless in the true sense.

1 Codeless tracking architecture

The SDK's codeless tracking relies mainly on viewPath and KVC. viewPath was introduced above; it identifies a view in the viewTree. KVC will be familiar to iOS developers and is often called one of the dark arts of iOS development. With KVC, an object's properties can be accessed directly by key or keyPath without calling explicit accessor methods. If you are not familiar with KVC, please read up on it separately; it is not explained further here.

So how can any desired business data be obtained without writing tracking code? Let's look at the overall architecture of the SDK's codeless tracking:

[Figure: overall architecture of the SDK's codeless tracking]

As the figure shows, codeless data collection in the SDK consists of three main steps: uploading the KVC configuration, requesting the KVC configuration, and collecting and reporting business data.

2 What is a KVC configuration

The figure above mentions the KVC configuration, so here is a brief explanation. A KVC configuration describes what data the app should collect and when. Its main fields are:

  • AppKey: identifies the application
  • AppVersion: identifies the version number of the application
  • ViewEvent: identifies an event type (the collection timing), such as ButtonClick, ListItemClick, ViewTap, etc.
  • ViewPath: the position of the target view in the viewTree
  • KeyPath: the path from the target view to the business data to be collected, used for the KVC lookup
  • KeyName: the key under which the collected business data is reported, forming a key-value pair. It distinguishes multiple collected values
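For illustration only, one such configuration might look like the dictionary below once it has been fetched and parsed. Every value is invented: the key spellings, the page and view class names, and the keyPath (which assumes the button's superview exposes a hypothetical order property) are not taken from the SDK.

```objc
#import <Foundation/Foundation.h>

// Hypothetical KVC configuration; all values are made up for illustration.
static NSDictionary<NSString *, NSString *> *GRSampleKVCConfig(void) {
    return @{
        @"appKey":     @"demo-app-key",
        @"appVersion": @"1.0.0",
        @"viewEvent":  @"ButtonClick",
        @"viewPath":   @"OrderViewController-UIView-UIButton&0-0-1",
        @"keyPath":    @"superview.order.totalAmount",
        @"keyName":    @"orderAmount",
    };
}
```

Read as a sentence, this entry says: when a ButtonClick happens on the view at this viewPath, read superview.order.totalAmount from that view and report it under the key orderAmount.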

3 Uploading and delivering KVC configurations

  • Uploading the KVC configuration: KVC configurations are uploaded with the visual selection SDK. This is invisible to users and is done and managed mainly by developers. It can be done at any time: to collect some business data in one or more versions of the app, upload the corresponding KVC configuration to the backend, and collection can be adjusted dynamically.
  • Requesting the KVC configuration: when it starts, the SDK requests all KVC configurations for the current app version from the backend and caches the result for the next step.

4 Business data collection and reporting

This part is the core of the SDK's technology, so its logic is described in detail. The process is as follows:

[Figure: flow of business data collection and reporting]

The core of this flow is viewPath matching for a view: the SDK iterates over the nodes of the viewPath and matches them, in order, against the current view and its parent views. This step therefore costs some time and performance. To keep that cost down, the SDK uses several optimizations, one of which is based on caching views.

4.1 Optimization based on cached views

The SDK caches information about the view that matched successfully, to cut down on unnecessary viewPath matching. The cached information is mainly:

  • TargetView: the view object that last matched the viewPath successfully.
  • IndexPath: the indexPath of the view that last matched the viewPath successfully, if it is not nil.

1 viewEvent matching

The first step is to match the event type. If the viewEvent specified in the KVC configuration is ButtonClick, then ListItemClick, ViewTap, and other events can be filtered out easily. This step filters out a large share of events; only events whose type matches go on to the next step.

2 targetView matching

The next step compares the cached targetView with the current view. If both point to the same object, go to step 3; otherwise go directly to step 4.

3 indexPath matching

Some may wonder why this step is needed. It is in fact important: it complements step 2 and mainly handles Cell reuse.

If the targetView cached in step 2 is a Cell or a subview of a Cell, then a successful match in step 2 does not guarantee that the current view is the one we really want to match. This may be hard to follow, so here is a simple example:

  • Suppose each Cell contains a button. When the button in the first row is tapped and the viewPath match succeeds, targetView caches the button object of the first row. The list is then scrolled down: the first row leaves the screen, the tenth row enters it, and the tenth row happens to reuse the first row's Cell. When the button in the tenth row is tapped, targetView and that button are the same object because of Cell reuse, but it is not the first-row button we actually want. So with Cell reuse, the result of step 2 cannot be assumed to be correct.

For this reason an indexPath check is added after step 2. The logic is: if the cached indexPath is not nil and is not equal to the indexPath of the current view, go to step 4; otherwise the current view is exactly the view that matched last time, viewPath matching is unnecessary, and the flow goes straight to step 5.
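Steps 2 and 3 together amount to one decision: can the cached match be reused, or is full viewPath matching still needed? The sketch below expresses that decision as a single function. The function name and parameter list are made up; views are passed as plain id so the example needs only Foundation.

```objc
#import <Foundation/Foundation.h>

// Returns YES when full viewPath matching (step 4) is still required,
// NO when the cached result can be reused and step 5 can run directly.
static BOOL GRNeedsViewPathMatch(id cachedTargetView,
                                 NSIndexPath *cachedIndexPath,
                                 id currentView,
                                 NSIndexPath *currentIndexPath) {
    // Step 2: a different object means the cache does not apply.
    if (cachedTargetView == nil || cachedTargetView != currentView) {
        return YES;
    }
    // Step 3: same object, but a reused Cell may now sit at another indexPath.
    if (cachedIndexPath != nil &&
        ![cachedIndexPath isEqual:currentIndexPath]) {
        return YES;
    }
    return NO;
}
```

In the reused-Cell example, the button object is identical in rows one and ten, so step 2 passes, but the cached and current indexPath differ, so the function still asks for a full match.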

4 viewPath matching

This step matches the current view and its parent views against each node of the viewPath in the KVC configuration. Because it is a loop, it takes some time, so a few simple optimizations are applied here as well. Before entering the loop, three checks are made:

  • Whether the class name of the view is equal;
  • Whether the class name of the view's viewController is equal;
  • Whether the class name of the view's window is equal;

These three checks also filter out a lot of unnecessary matching. The viewPath loop runs only when all three pass.

5 KVC value and report

At this point the timing of data collection has been verified. Next, the keyPath from the KVC configuration is passed to valueForKeyPath: to obtain the corresponding value. If the value is not nil, it is paired with the keyName as a key-value pair and added to the current event data for reporting. The backend can then find the business data by that key.
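Here is a minimal sketch of this last step on toy objects. GROrder, GRSubmitButton, and GRCollectBusinessData are made-up names. The @try/@catch guard is a defensive choice for this standalone example only, so that an outdated keyPath does not crash it; it is not presented as what the SDK does.

```objc
#import <Foundation/Foundation.h>

// Toy business model and toy view; a real app would use its own classes.
@interface GROrder : NSObject
@property (nonatomic, strong) NSNumber *totalAmount;
@end
@implementation GROrder
@end

@interface GRSubmitButton : NSObject
@property (nonatomic, strong) GROrder *order;
@end
@implementation GRSubmitButton
@end

// Reads `keyPath` from `view` via KVC and, if a value exists,
// stores it in `eventData` under `keyName`.
static void GRCollectBusinessData(id view, NSString *keyPath,
                                  NSString *keyName,
                                  NSMutableDictionary *eventData) {
    id value = nil;
    @try {
        value = [view valueForKeyPath:keyPath];
    } @catch (NSException *exception) {
        // e.g. an undefined key when the configuration is out of date
        value = nil;
    }
    if (value != nil) {
        eventData[keyName] = value;
    }
}
```

With a button whose order has totalAmount 100, the keyPath order.totalAmount and the keyName orderAmount add the pair orderAmount = 100 to the event data; a keyPath that no longer exists adds nothing.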

The above is only a brief introduction to the matching logic. The real implementation also handles additional cases involving the indexPath of a cell; for reasons of space this is not explained in detail here.

5 Adding exception handling for KVC

The SDK's codeless tracking depends mainly on KVC, but KVC is well known to be dangerous and can easily crash the program. For example, if the property named by a key or keyPath does not exist, the program throws an NSUndefinedKeyException, and if the app does not handle the exception, it crashes.

To avoid such crashes, the SDK handles KVC exceptions internally. Specifically, it adds a Category on NSObject that overrides valueForUndefinedKey: and returns nil.

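#import <Foundation/Foundation.h>

@implementation NSObject (KVCExceptionHandler)

// Return nil for an unknown key instead of raising NSUndefinedKeyException.
- (nullable id)valueForUndefinedKey:(NSString *)key {
    return nil;
}

@end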

Other key technologies

There are of course many other key technical points in the SDK's implementation, such as data collection for RN pages, the page alias scheme, and compatibility between Method Swizzling and Aspects. This article is already long, and readers' patience is limited, so they are not explained here; later articles will cover them separately.

END

For all its length, this article mainly introduced two key technical points of the SDK. I hope it is of some reference value. If you have suggestions for improving it, you are welcome to discuss them.

Finally, I would like to thank my colleague Wang Jiale, whose typesetting and proofreading made this article more presentable. I would also like to thank all my colleagues in the group, who gave me a great deal of help when I ran into difficulties during development.

Q & A

Questions raised about this article (other than those in the comments) are recorded here and answered together.

Q1: The SDK uses KVC configurations to obtain business data. Does this add KVC configuration maintenance work?

A1: KVC configurations do have to be maintained and managed, but the SDK also simplifies that work.

In general, every uploaded KVC configuration must correspond to an app version, because the keyPath may differ from one app version to another. The work related to KVC configurations therefore consists of two tasks:

  1. Upload the KVC configuration for the current app version in order to obtain the desired business data
  2. When a new app version is released, verify whether the previous version's KVC configuration still applies to the new version. If it does, add the new version number to the KVC configuration in the management backend; if it does not, upload a new KVC configuration for the new version.

As this shows, KVC configurations accumulate as the app iterates, and maintaining and managing them becomes quite tedious.

To relieve this pain point, the SDK adds a scheme that avoids the repetitive, tedious work. Specifically:

  • When uploading a KVC configuration, specify a version range, or specify no version at all (meaning it applies to all current versions);
  • When the SDK fails to obtain business data with a KVC configuration, it adds an error log and reports it. The log contains the appKey, appVersion, keyPath, and other information, so the backend shows clearly which KVC configuration failed in which app version;
  • Scripts monitor KVC-related error logs. When an error log is reported, an email is sent to the person responsible;

With this scheme, KVC configuration management is reduced to a single task:

  • Use the log information to quickly find the affected KVC configuration, and upload a new version of it

Q2: How can data be collected and counted when content and position may change over time?

A2: Use the data SDK together with the visual selection SDK to collect and count the data dynamically.

This problem is quite common in real products; for example, the content of an app's home page is mostly configured from the backend. It can be broken down into the following two cases:

  • The same position displays different content
  • The same content is displayed in different positions

Note that the two are not the same. They correspond to different scenarios, and the data collection schemes differ as well.

In addition, the “position” can be inside a list or outside one. This makes little difference to the overall scheme; it only changes where the wildcard sits in the viewPath when position is not a concern.

A2.1 Different content displayed in the same position

Example: the app home page has a slot that shows recent activities. At first it shows the picture for activity 1; some time later, operations staff replace it with the picture for activity 2. How can the clicks on activity 1 and activity 2 be counted separately?

For this scenario the SDK's solution is “care about position” + “care about content”.
“Care about position” means that only the current position is targeted; concretely, the viewPath contains no wildcards.

The whole process has three parts:

  • The visual selection SDK uploads a “care about position” KVC configuration. The keyPath in the KVC configuration is set to fetch the activity URL.
  • When a click occurs, the data SDK collects the URL of the current activity and reports it with the click event.
  • The visual selection SDK uploads a “care about position” + “care about content” selection configuration, in which the content is set to the URL value of the activity.

A2.2 The same content displayed in different positions

Example: the app home page has four fixed entries, one of which is called “popular recommendations”. Depending on the order configured in the backend, “popular recommendations” can appear in any of the four positions: for a while it may be shown first, and later it may be shown second. How can clicks on “popular recommendations” be counted?

For this scenario the SDK's solution is “do not care about position” + “care about content”.
“Do not care about position” means that the viewPath contains wildcards, which stand for several positions in the viewTree. For example, to match all rows in a list, the indexPath in the viewPath is replaced by a wildcard.

The solution has the following three steps:

  • The visual selection SDK uploads a “do not care about position” KVC configuration. The keyPath in the KVC configuration is set to fetch the title of the entry.
  • When any of the four entries is clicked, the data SDK collects the title of the entry and reports it with the click event.
  • The visual selection SDK uploads a “do not care about position” + “care about content” selection configuration, in which the content is set to “popular recommendations”.

At this point data collection and selection configuration are complete; what remains is the backend statistics.
The two cases above are no different as far as backend statistics are concerned: both use the same statistical scheme. Here is a rough outline of the backend approach:

  • From the selection configuration uploaded in the third step, generate a regular expression based on the viewPath and the “care about content” value, match it against the raw data reported by the data SDK, and count the matching records.
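The viewPath half of that regular expression could be built roughly as follows. This is only a sketch under assumptions: the article does not name the wildcard character, so * is assumed here, a wildcard is taken to mean “any run of digits”, and the function names and sample paths are invented. The “care about content” value would be compared separately and is not shown.

```objc
#import <Foundation/Foundation.h>

// Builds a regular expression from a configured viewPath in which "*"
// (an assumed wildcard character) stands for any index.
static NSRegularExpression *GRRegexForViewPath(NSString *configuredPath) {
    // Escape the literal pieces between wildcards, then join the pieces
    // with a pattern that accepts one or more digits.
    NSMutableArray<NSString *> *escapedPieces = [NSMutableArray array];
    for (NSString *piece in [configuredPath componentsSeparatedByString:@"*"]) {
        [escapedPieces addObject:
            [NSRegularExpression escapedPatternForString:piece]];
    }
    NSString *pattern = [NSString stringWithFormat:@"^%@$",
        [escapedPieces componentsJoinedByString:@"[0-9]+"]];
    return [NSRegularExpression regularExpressionWithPattern:pattern
                                                     options:0
                                                       error:NULL];
}

// Returns YES if a reported viewPath matches the configured one.
static BOOL GRViewPathMatches(NSString *configuredPath,
                              NSString *reportedPath) {
    NSRegularExpression *regex = GRRegexForViewPath(configuredPath);
    NSRange whole = NSMakeRange(0, reportedPath.length);
    return [regex firstMatchInString:reportedPath
                             options:0
                               range:whole] != nil;
}
```

A configured path ending in 0:* then matches reported paths for every row of section 0, while a configured path without wildcards matches only the exact position, which covers both the “do not care about position” and “care about position” cases with one mechanism.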