Early-stage phishing detection on the Ethereum transaction network

As cryptocurrency is widely accepted and used, attendant illegal activities have attracted extensive attention, especially phishing scams, which bring great losses to both customers and countries. From the perspective of crime prevention, early warning of such illegal behaviors is of great significance. However, most existing studies focus on detecting phishing scams that have already occurred and been reported. In addition, previous studies ignore the temporal order of users' appearance and thus cannot accurately extract features reflecting users’ transaction patterns. In this paper, we propose a framework called early-stage phishing detection to address the problem of early phishing detection. According to the phishing amount, we first divide the process of phishing scams into three stages: early stage, middle stage, and late stage. Then, we develop a feature extraction method to capture features from both the local network structures and the time series of transactions. In experiments, the dataset is strictly partitioned by time series, and experimental results show that our proposed method outperforms existing graph embedding methods on a real-world Ethereum transaction dataset. Finally, we select the ten most important features and analyze the differences between phishing users and normal users on these features, which provide useful insights for regulators and platforms to detect phishing scams in advance.


Introduction
Blockchain is an open and distributed ledger technology that breaks through the limitations of traditional centralized technology (Lakhani and Iansiti 2017). Based on blockchain techniques, cryptocurrencies have attracted extensive attention globally. As of March 11, 2021, the total market value of global cryptocurrencies exceeded $1.69 trillion. Among them, Bitcoin accounted for 61.35%, ranking first, and Ether accounted for 12.12%, ranking second. One of the main reasons for the popularity of cryptocurrencies is their unique advantage: decentralized peer-to-peer payments, which can eliminate the additional operating costs charged by intermediaries. However, cryptocurrencies also face risks. In recent years, many illegal behaviors have emerged in cryptocurrency. The "Crypto Crime Report 2020", issued by the blockchain organization Chainalysis, pointed out that phishing attacks account for a high proportion of total cybercrimes, becoming one of the main attack methods for fraudulent activities.
Anomaly or fraud detection is an important research area (Gao et al. 2020; Wolsing et al. 2022). Phishing is the fraudulent attempt to obtain sensitive information or data, such as usernames, passwords, or credit card details, by impersonating a trustworthy entity in digital communication (Ramzan 2010; Van der Merwe et al. 2005). Unlike traditional phishing scams that rely on emails and websites, cryptocurrency phishing is generally more diverse. The anonymity of cryptocurrency allows phishing users to boldly use multiple channels to carry out phishing scams. For example, in July 2020, hackers gained access to more than a dozen high-profile Twitter accounts, including those of Bill Gates and Elon Musk. After taking over the accounts, the hackers posted a message that used the promise of a doubled return as bait to lure users into sending cryptocurrency funds to a designated account address. Therefore, traditional detection methods are no longer suitable for cryptocurrency phishing detection.
Due to the open and transparent nature of blockchain technology, we can obtain all transaction data of users, which makes it possible to detect phishing users by mining their transaction behavior. Furthermore, with the development of machine learning, various data-driven algorithms have been applied to identify suspicious activities in fintech applications with satisfactory results (Stojanović et al. 2021).
In the past few years, several methods have conducted network modeling analyses on public transaction data, especially through the Ethereum transaction network (Chen et al. 2020b; Ferretti and D'Angelo 2019; Guo et al. 2019; Lee et al. 2020; Li et al. 2020; Victor and Lüders 2019; Zheng et al. 2020). These studies construct cryptocurrency transaction data into graphs, where nodes represent accounts and edges represent transactions.
To detect phishing users on graphs, most existing studies use feature extraction or graph embedding methods to obtain input node features during data processing. Given the available user labels, supervised learning methods can then achieve satisfactory detection performance after training. However, these works have several limitations. (1) There is a danger of data leakage. Most of these works directly extract features and build training and testing sets without considering the temporal order of node appearance, which means that node features may contain future information and that future nodes may appear in the training sets. (2) Although graph embedding methods can deal with the network efficiently, they have high training costs, weakly explainable features, and a high dependence on the network structure. This means that the whole model has to be retrained when new users join the transaction network. (3) Existing feature extraction methods do not fully mine the structural and time series information of transaction networks. They focus either on structural information or on transaction amount information. (4) Existing research has limited application value for early phishing detection (detecting phishing users before they are reported). From the fraud prevention perspective, a detection method not only needs to identify fraud that has occurred but also needs to have early warning capabilities (Chang and Chang 2012).
In this paper, we propose an early-stage phishing detection (EPD) framework to address these issues. Specifically, we develop a local network-based and time series-based feature extraction method (LTFE) that captures features from both the local network structures and the time series of transactions. Then, we adopt a time series splitting method to split the dataset into training and testing sets, and divide the process of phishing scams into early stage, middle stage and late stage according to their phishing amount. Finally, we employ a logistic regression (LR) model to detect potential phishing users. As shown in Fig. 1, the EPD framework can be simplified into three components: a data processing component, a feature extraction method, and a phishing detection model that detects suspicious nodes.
In summary, the major contributions of our work are as follows:
1. To the best of our knowledge, this is the first work that considers the fraud prevention problem on the Ethereum transaction network. We divide the phishing scams into different stages according to their phishing amount and then conduct early-stage phishing detection tasks.
2. The features extracted by LTFE effectively explore the local network structure and transaction time series, which can be well explained and help us understand the behavior of phishing accounts and study their patterns.
3. Compared with existing embedding methods, we avoid the problem of data leakage, and statistical tests demonstrate that the proposed method can achieve better performance on the early-stage phishing detection tasks.
The rest of this paper is organized as follows: Section 2 summarizes the related work. Section 3 introduces the definitions and preliminaries. Section 4 describes the detection framework. The experiments are detailed in Sect. 5. Finally, we conclude the paper in Sect. 6.

Traditional phishing detection
Phishing can be traced back to 1996 owing to social engineering attacks against America Online (AOL) accounts by online scammers (Khonji et al. 2013). With the growth of online businesses, such as e-commerce, phishing scams have become a considerable threat to financial security. Phishing practitioners impersonate a website of an honest firm to obtain users' private information. In response to these phishing attacks, researchers have conducted extensive research on phishing identification.
In the initial stage of phishing detection, the detection methods were mainly based on list recognition and similarity comparison (Han et al. 2012; Sharifi and Siadati 2008). However, this kind of identification technique needs to be updated in real time, and the life cycle of a phishing website is much shorter than that of a normal website, which limits the efficiency and accuracy of these methods. Therefore, some scholars have proposed heuristic methods (Jain and Gupta 2018), but the rules defined by these methods are relatively simple and easily circumvented by attackers. With the rapid growth of machine learning, classification algorithms have been widely adopted to learn the differences between phishing websites and legitimate websites and to identify high-risk websites (Sahingoz et al. 2019).

Ethereum phishing detection
The enormous value of cryptocurrencies has caught the attention of criminals. At the same time, the anonymity of cryptocurrencies also facilitates these illegal activities. Compared with traditional scenarios, phishing scams on Ethereum can be more diverse, because phishers can spread their accounts to victims through various channels. Therefore, traditional methods cannot be directly used for Ethereum phishing detection.
To perform phishing detection on Ethereum, existing research makes full use of its open-source transaction data. Among these works, Podgorelec et al. (2019) and Chen et al. (2020c) adopt feature extraction methods to investigate the characteristics of nodes, which are then employed to train the detection model. However, these works do not fully mine the structural and time series information of transaction networks, resulting in suboptimal results.
Different from previous works, Yuan et al. (2020) adopt the graph embedding method node2vec (Grover and Leskovec 2016) to capture the node features of the transaction network, which provides new insights for follow-up research. Chen et al. (2020a) sample subgraphs by a random walk through the neighbor relationships of the largest connected component and learn node features based on the Graph Convolutional Network. Owing to the importance of the timestamp and transaction amount, Wu et al. (2020) subsequently propose the trans2vec embedding detection method, which considers these characteristics in its walking strategy to learn users' representations. Wang et al. (2021) extract the subgraph of each address in the transaction network and then adopt subgraph2vec (Narayanan et al. 2016) to obtain their features. After that, these features are used to train a classification model. However, these methods are highly dependent on the network structure and have high training costs, so they cannot be easily updated to adapt to the dynamic changes of the network.
Inspired by existing research, we propose an early-stage Ethereum phishing detection method to capture structural and time series information that can be easily obtained and updated to detect phishing users. Furthermore, we pay attention to the temporal order of node appearance when dividing the dataset to ensure the validity of the experimental results.

Definitions and preliminaries
In this section, we first introduce the basic definitions of the Ethereum transaction network and early-stage phishing. Then, we formulate the early-stage phishing detection problem.
Ethereum transaction network. A transaction graph $G(V, E, X, Y)$ is a multidigraph, where $V$ is the set of nodes and $E = \{e_{ij};\ i, j \in V,\ i \neq j\}$ is the set of edges. Let $v_i$ denote an account and $e_{ij}$ denote a transaction from $v_i$ to $v_j$. $X$ is the set of edge attributes: the edge attributes of $e_{ij}$ are $a_{ij}$ and $t_{ij}$, where $a_{ij}$ is the transaction amount from $v_i$ to $v_j$ and $t_{ij}$ is the timestamp of the transaction. $Y$ is the set of node labels, where $Y = \{y_i;\ i \in V\}$, with $y_i = 1$ for a phishing node and $y_i = 0$ for a normal node. Note that there may be multiple transactions between a pair of nodes.
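As a minimal sketch of this definition, the multidigraph can be held as per-node lists of incoming and outgoing transactions. The class and field names below (`Transaction`, `TransactionGraph`, `in_edges`, `out_edges`) are illustrative assumptions, not names used by the paper or its dataset.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transaction:
    src: str        # sending account v_i
    dst: str        # receiving account v_j
    amount: float   # a_ij, transaction amount
    timestamp: int  # t_ij, e.g. Unix seconds (unit is an assumption)


@dataclass
class TransactionGraph:
    # Parallel edges are kept, so each list may hold several
    # transactions between the same pair of accounts.
    in_edges: dict = field(default_factory=lambda: defaultdict(list))
    out_edges: dict = field(default_factory=lambda: defaultdict(list))
    labels: dict = field(default_factory=dict)  # y_i: 1 = phishing, 0 = normal

    def add(self, tx: Transaction) -> None:
        if tx.src == tx.dst:  # the definition requires i != j
            return
        self.out_edges[tx.src].append(tx)
        self.in_edges[tx.dst].append(tx)


g = TransactionGraph()
g.add(Transaction("a", "u", 1.5, 100))
g.add(Transaction("a", "u", 0.5, 200))  # second transaction on the same pair
g.add(Transaction("u", "b", 2.0, 300))
g.add(Transaction("u", "u", 9.9, 400))  # self-loop, ignored
g.labels["u"] = 1
```

Keeping both edge directions indexed by node makes the in-transactions and out-transactions of a target node available without scanning the whole edge set.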
Early-stage phishing scams. Phishing scams can be active for a while; we divide the active period of each phishing scam into early, middle, and late stages according to its total phishing amount. For each node, we define in-transactions as transactions from other nodes to it, and denote $st$ and $et$ as the start time and end time of these in-transactions, respectively. Given timestamp $T$ with $st \le T \le et$, we focus on the ratio of the current fraud amount at $T$ to the total fraud amount at $et$. When the ratio is in the ranges of 0-33.33%, 33.33-66.67%, and 66.67-100%, we call the current stage the early stage, middle stage, and late stage, respectively.
Early-stage phishing detection. Given the current timestamp $T$, suppose there are several phishing scams on the transaction network. Some accounts have been reported by victims as phishing nodes, while others continue to deceive users; we denote them as marked phishing nodes and unmarked phishing nodes, respectively. Taking Fig. 2 as an example, there are currently three kinds of nodes: red nodes, gray nodes, and white nodes, which represent marked phishing nodes, unmarked phishing nodes, and normal nodes, respectively. Each red node evolves from a gray node, and the purpose of this study is to detect the gray nodes as early as possible to effectively reduce users' losses.
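The stage definition can be sketched as a small function. This is an illustration only: the function name `phishing_stage`, the `(timestamp, amount)` input layout, and the handling of values that fall exactly on a boundary (assigned to the earlier stage here) are assumptions, since the text gives only the three ranges.

```python
def phishing_stage(in_txs, T):
    """Return 'early', 'middle' or 'late' for one phishing node at time T.

    in_txs: list of (timestamp, amount) pairs, the node's in-transactions.
    """
    st = min(t for t, _ in in_txs)
    et = max(t for t, _ in in_txs)
    if not st <= T <= et:
        raise ValueError("T must satisfy st <= T <= et")
    total = sum(a for _, a in in_txs)           # total fraud amount at et
    if total <= 0:
        raise ValueError("total fraud amount must be positive")
    current = sum(a for t, a in in_txs if t <= T)  # fraud amount so far
    ratio = current / total
    if ratio <= 1 / 3:      # boundary handling is an assumption
        return "early"
    if ratio <= 2 / 3:
        return "middle"
    return "late"


example = [(1, 1.0), (2, 1.0), (3, 2.0), (4, 6.0)]  # total amount is 10
stages = [phishing_stage(example, T) for T in (1, 3, 4)]
```

In the example, the ratios at T = 1, 3, and 4 are 0.1, 0.4, and 1.0, so the three calls fall into the early, middle, and late stages in turn.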

Methodology
In this section, we first introduce the data processing component. Then, we provide a detailed description of the feature extraction component. Based on these two components, we introduce the phishing detection part.

Data processing
Due to the temporal nature of transactions, randomly splitting the dataset for the following task is not suitable here. It should be noted that the nodes in the graph may have transactions with others at different times. Therefore, we take three steps to process the transaction data and build training, validation, and test sets (as shown in Fig. 3).
(1) Positive sample splitting Given the current timestamp $T$ and time window $w$, we build the transaction network $G_{T-w}$, which contains all transactions that occurred before $T - w$ and is the network from which we extract node features. Then, we denote the marked phishing nodes with $et \le T - w$ as positive samples of the [...]
(2) Negative sample sampling To maintain temporal consistency, for each positive sample, we randomly select a normal sample whose $et$ is close to its $et$ as a negative sample. Note that in each split, nodes in the test set must have a larger $et$ than nodes in the validation set, and nodes in the validation set must have a larger $et$ than nodes in the training set.
(3) K-fold data splitting To make full use of the transaction data and ensure the robustness of the method, we split the transaction data k times. As shown in Fig. 3, we repeat the above steps k times and move T backward by the time window w each time.
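Two of the operations above can be sketched in a few lines: restricting the network to transactions before a cutoff, and pairing each positive sample with a normal node whose end time is close. The function names, the `tolerance` parameter, the fallback to the nearest node, and sampling without replacement are all assumptions; the text only states that the negative sample's $et$ should be "close".

```python
import random


def network_before(transactions, cutoff):
    """Keep (src, dst, amount, timestamp) tuples that occurred before cutoff."""
    return [tx for tx in transactions if tx[3] < cutoff]


def sample_negatives(positive_ets, normal_ets, tolerance, rng):
    """Pair each positive node with one normal node of similar end time.

    positive_ets, normal_ets: dicts mapping node id -> et.
    tolerance: hypothetical maximum |et difference| counted as 'close'.
    """
    available = dict(normal_ets)
    pairs = {}
    for pos, et in positive_ets.items():
        if not available:
            break
        close = [n for n, e in available.items() if abs(e - et) <= tolerance]
        if not close:
            # Assumed fallback: take the nearest remaining normal node.
            close = [min(available, key=lambda n: abs(available[n] - et))]
        chosen = rng.choice(close)
        pairs[pos] = chosen
        del available[chosen]  # each normal node is used at most once
    return pairs


txs = [("a", "b", 1.0, 10), ("b", "c", 2.0, 20)]
earlier = network_before(txs, cutoff=20)

pairs = sample_negatives(
    {"p1": 100, "p2": 500},
    {"n1": 98, "n2": 505, "n3": 1000},
    tolerance=10,
    rng=random.Random(0),
)
```

Passing an explicit `random.Random` instance keeps repeated negative sampling reproducible across runs.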

Feature extraction
In general, the feature extraction component LTFE consists of two parts: (1) extracting features based on the local network and (2) extracting features based on the time series of the transaction amount and transaction timestamp. Taking the target node u as an example, Fig. 4 shows the whole process of the LTFE method. The details are discussed in the following subsections. It should be noted that, given timestamp $T$, the features we extract are based on the transaction network $G_{T-w}$.
(1) Feature extraction based on the local network With the target node u being the central node, as listed in Table 1, we measure it in 14 dimensions based on its context and mark them as local network-based (Ln-based) features.
$Ln_1$, $Ln_2$, and $Ln_3$ measure the number of transactions. They measure the number of in-going edges, out-going edges, and the total number of edges of the target node, respectively. Note that the degree in a multigraph refers to the number of transactions, including multiple transactions between the same node pair.
$Ln_4$, $Ln_5$, and $Ln_6$ measure the number of neighbors, which is different from the number of transactions because a neighbor may have multiple transactions with the target node. They measure the number of upstream neighbors, downstream neighbors, and the total number of neighbors of the target node, respectively. For example, if a target node has only one neighbor but five transactions, then its $Ln_3$ is 5 and its $Ln_6$ is 1.
$Ln_7$, $Ln_8$, and $Ln_9$ consider the transaction amount. They measure the total transaction amount of upstream neighbors, downstream neighbors, and all neighbors of the target node, respectively.
$Ln_{10}$, $Ln_{11}$, and $Ln_{12}$ measure the average number of transactions between the same node pair. The expressions are as follows:
$$Ln_{10} = \frac{Ln_1}{Ln_4} \qquad (1)$$
$$Ln_{11} = \frac{Ln_2}{Ln_5} \qquad (2)$$
$$Ln_{12} = \frac{Ln_3}{Ln_6} \qquad (3)$$
Owing to the specificity of phishing behavior, a phishing node would not transact with one neighbor frequently, while transactions of normal nodes tend to be continuous and repeated, which makes their average number of transactions greater than 1.
We define $Ln_{13}$ as the ratio of close neighbors, where "close neighbors" refers to those neighbors that conduct bidirectional transactions with the target node. It is believed that friends who interact with each other will be closer than those who do not; therefore, we define this feature to reveal the intimacy between the target node and its neighbors.
$Ln_{14}$ is the local clustering coefficient of the target node, which measures the cliquishness of the target node's neighborhood (Watts and Strogatz 1998). The expression is as follows:
$$Ln_{14} = \frac{N_{\Delta}}{Ln_6 (Ln_6 - 1)/2} \qquad (4)$$
where $N_{\Delta}$ is the number of triangles that contain the target node in the transaction network, and $Ln_6 (Ln_6 - 1)/2$ refers to the maximum possible number of triangles that contain the target node. Here, we ignore the number of transactions and the direction between node pairs. Since normal nodes conduct transactions based on various legitimate activities, such as commercial transactions, their neighbors may also belong to the same commercial activity circle, thus generating transactions with each other. However, the fraudulent motivation of phishing users makes their neighbors more dispersed; hence, they may have a smaller $Ln_{14}$ value than normal nodes.
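The fourteen Ln-based features can be sketched directly from a list of transactions. This is an illustration under stated assumptions rather than the authors' implementation: the function name and tuple layout are hypothetical, $Ln_7$ to $Ln_9$ are read as the amounts received from upstream neighbors, sent to downstream neighbors, and their sum, $Ln_{13}$ is normalized by $Ln_6$, and any ratio with a zero denominator is set to 0.

```python
def local_network_features(txs, u):
    """Compute Ln_1..Ln_14 for target node u.

    txs: list of (src, dst, amount, timestamp) tuples.
    """
    txs = [t for t in txs if t[0] != t[1]]  # drop self-loops (i != j)
    in_txs = [t for t in txs if t[1] == u]
    out_txs = [t for t in txs if t[0] == u]
    upstream = {t[0] for t in in_txs}
    downstream = {t[1] for t in out_txs}
    neighbors = upstream | downstream

    def ratio(a, b):
        return a / b if b else 0.0  # zero-denominator convention is assumed

    ln = {1: len(in_txs), 2: len(out_txs)}
    ln[3] = ln[1] + ln[2]
    ln[4], ln[5], ln[6] = len(upstream), len(downstream), len(neighbors)
    ln[7] = sum(t[2] for t in in_txs)
    ln[8] = sum(t[2] for t in out_txs)
    ln[9] = ln[7] + ln[8]
    ln[10] = ratio(ln[1], ln[4])
    ln[11] = ratio(ln[2], ln[5])
    ln[12] = ratio(ln[3], ln[6])
    ln[13] = ratio(len(upstream & downstream), ln[6])

    # Undirected, unweighted adjacency: direction and multiplicity ignored.
    adj = {}
    for s, d, _, _ in txs:
        adj.setdefault(s, set()).add(d)
        adj.setdefault(d, set()).add(s)
    nbrs = sorted(neighbors)
    triangles = sum(
        1
        for i, a in enumerate(nbrs)
        for b in nbrs[i + 1:]
        if b in adj.get(a, ())
    )
    ln[14] = ratio(triangles, ln[6] * (ln[6] - 1) / 2)
    return ln


txs = [
    ("a", "u", 1.0, 1), ("a", "u", 2.0, 2), ("u", "a", 0.5, 3),
    ("b", "u", 1.0, 4), ("a", "b", 1.0, 5), ("u", "c", 3.0, 6),
]
ln = local_network_features(txs, "u")
```

In this toy network, u has three neighbors (a, b, c), one of which (a) transacts in both directions, and only the pair (a, b) is connected, so both $Ln_{13}$ and $Ln_{14}$ equal 1/3.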
(2) Feature extraction based on time series of transactions In addition to the network structure, the transaction network also has the characteristics of time series. In other words, the transactions of the target node are sequential. Similar to the definition of in-transactions, we define out-transactions as all transactions originating from the target node. Considering the transaction direction, we build four time series for each target node. Taking the target node u in Fig. 4 as an example, we arrange in-transactions and out-transactions in time order and use $ts$ to represent a time series. Assuming that the timestamps satisfy $t_{x_1 u} < t_{x_2 u} < t_{x_3 u} < t_{x_4 u} < t_{x_5 u}$ (in-transactions) and $t_{u x_6} < t_{u x_7} < t_{u x_8}$ (out-transactions), the four time series are as follows:
$$ts_{\text{In-Amount}} = (a_{x_1 u}, a_{x_2 u}, a_{x_3 u}, a_{x_4 u}, a_{x_5 u}) \qquad (5)$$
$$ts_{\text{In-Time}} = (t_{x_2 u} - t_{x_1 u}, \ldots, t_{x_5 u} - t_{x_4 u}) \qquad (6)$$
$$ts_{\text{Out-Amount}} = (a_{u x_6}, a_{u x_7}, a_{u x_8}) \qquad (7)$$
$$ts_{\text{Out-Time}} = (t_{u x_7} - t_{u x_6}, t_{u x_8} - t_{u x_7}) \qquad (8)$$
where $ts_{\text{In-Amount}}$ is the amount-based time series of in-transactions, and $ts_{\text{In-Time}}$ is the time interval series of in-transactions. Similarly, $ts_{\text{Out-Amount}}$ is the amount-based time series of out-transactions, and $ts_{\text{Out-Time}}$ is the time interval series of out-transactions. Then, we choose the statistical measures shown in Table 1 to reveal the transaction characteristics of the target node. For the sake of clarity and simplicity, we mark the features of $ts_{\text{In-Amount}}$, $ts_{\text{In-Time}}$, $ts_{\text{Out-Amount}}$, and $ts_{\text{Out-Time}}$ as In_A, In_T, Out_A, and Out_T plus the feature number. For example, the features of $ts_{\text{In-Amount}}$ are marked as In_A1, In_A2, …, In_A10.
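A short sketch of how the four series could be built from raw transactions is given below. The function name, the dictionary keys, and the `(src, dst, amount, timestamp)` tuple layout are illustrative assumptions.

```python
def build_time_series(txs, u):
    """Return the four time series of target node u.

    txs: list of (src, dst, amount, timestamp) tuples in any order.
    """
    in_txs = sorted((t for t in txs if t[1] == u), key=lambda t: t[3])
    out_txs = sorted((t for t in txs if t[0] == u), key=lambda t: t[3])

    def amounts(seq):
        return [t[2] for t in seq]

    def intervals(seq):
        # Differences between consecutive timestamps: one element
        # fewer than the number of transactions.
        return [b[3] - a[3] for a, b in zip(seq, seq[1:])]

    return {
        "In_Amount": amounts(in_txs),
        "In_Time": intervals(in_txs),
        "Out_Amount": amounts(out_txs),
        "Out_Time": intervals(out_txs),
    }


txs = [
    ("x3", "u", 3.0, 40), ("x1", "u", 1.0, 10), ("u", "x6", 2.0, 50),
    ("x2", "u", 2.0, 20), ("x5", "u", 5.0, 110), ("u", "x8", 1.0, 100),
    ("x4", "u", 4.0, 70), ("u", "x7", 0.5, 90),
]
series = build_time_series(txs, "u")
```

With five in-transactions and three out-transactions, as in the example of the text, the interval series have four and two elements, respectively.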
$Ts_1$ is the number of elements in the time series $ts$. Note that this feature duplicates some of the Ln-based features. Taking the amount-based time series $ts_{\text{In-Amount}}$ as an example, $Ts_1$ is the number of in-transactions, which equals the in-degree of the target node in the transaction network. Here, we keep these kinds of features to investigate the performance gains of Ln-based features and Ts-based features in the experiments in Sect. 5.5.
$Ts_2$ is the sum of elements in $ts$. For amount-based time series, $Ts_2$ is the total amount of transactions; for time-based time series, $Ts_2$ is the time span of transactions.
$Ts_3$, $Ts_4$, $Ts_5$, $Ts_6$, and $Ts_7$ are the mean value, the maximum value, the median value, the minimum value, and the standard deviation of $ts$, respectively.
$Ts_8$ is the skewness of $ts$, which reflects the asymmetry of the time series distribution. It is computed from the elements $x_i$ of $ts$, their sample mean $\bar{x}$, and the number of samples $n$ in $ts$. Positive skewness means that the distribution has a longer right tail, with most of the values lying to the left of the mean; negative skewness means that the distribution has a longer left tail, with most of the values lying to the right of the mean; and zero skewness means that the distribution of $ts$ is symmetric.
$Ts_9$ is the kurtosis of $ts$, which measures the tailedness of $ts$. A kurtosis of zero corresponds to a normal distribution, a kurtosis less than zero to a thin-tailed distribution, and a kurtosis greater than zero to a fat-tailed distribution. Tails represent the probability of values that are extremely high or low compared to the mean value. When there is not enough data in $ts$, kurtosis cannot be calculated. In this case, we assume that $ts$ satisfies a normal distribution, which means that the extreme values are neither highly frequent nor highly infrequent.
$Ts_{10}$ is the entropy, which measures the uncertainty of $ts$. We perform equidistant binning on the values of $ts$:
$$k_{opt} = \min(bin, Ts_1) \qquad (11)$$
$$Ts_{10} = -\sum_{k=1}^{k_{opt}} p_k \log p_k$$
where $k_{opt}$ is the number of subintervals of $ts$, $bin$ is the number of subintervals we set, $Ts_1$ is the number of elements of $ts$, and $p_k$ represents the proportion of the elements of $ts$ falling in the $k$-th subinterval. In this study, we set $bin$ to 5. The smaller the entropy, the more the values of $ts$ concentrate in a few subintervals and the lower the uncertainty of the time series.
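The ten statistics can be sketched for a single series as follows. Several choices here are assumptions that the text does not fix: the moment-based (population) forms of the standard deviation, skewness, and excess kurtosis, the fallback to 0 for a constant series, the natural logarithm in the entropy, and placing the maximum value in the last bin. The function name `ts_features` is hypothetical.

```python
import math
import statistics


def ts_features(ts, bins=5):
    """Compute Ts_1..Ts_10 for one non-empty time series."""
    n = len(ts)
    if n == 0:
        raise ValueError("time series must not be empty")
    mean = sum(ts) / n
    m2 = sum((x - mean) ** 2 for x in ts) / n  # central moments
    m3 = sum((x - mean) ** 3 for x in ts) / n
    m4 = sum((x - mean) ** 4 for x in ts) / n

    feats = {
        1: n,
        2: sum(ts),
        3: mean,
        4: max(ts),
        5: statistics.median(ts),
        6: min(ts),
        7: math.sqrt(m2),
    }
    # Assumed fallback: a constant series gets skewness 0 and kurtosis 0,
    # in line with the normal-distribution assumption in the text.
    feats[8] = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    feats[9] = m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0

    # Entropy over k_opt = min(bin, Ts_1) equal-width subintervals.
    k_opt = min(bins, n)
    lo, hi = min(ts), max(ts)
    counts = [0] * k_opt
    for x in ts:
        if hi == lo:
            idx = 0
        else:
            idx = min(int((x - lo) / (hi - lo) * k_opt), k_opt - 1)
        counts[idx] += 1
    feats[10] = sum((c / n) * math.log(n / c) for c in counts if c)
    return feats


feats = ts_features([1, 2, 3, 4, 5])
```

For the symmetric series 1 to 5, the skewness is 0, the excess kurtosis is negative (a flat, thin-tailed shape), and each of the five bins receives one element, which gives the maximum entropy for five bins.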

Phishing detection process
After data processing and feature extraction, we train a classification model to detect suspicious nodes. We first train several classifiers on the training sets, then adjust the model parameters and choose the final classifier based on the detection results on the validation sets. Because we split the data multiple times in the data processing component, we use the Wilcoxon signed-rank test (Wilcoxon 1945), as recommended by Demšar (2006), to test the statistical significance of the results, and we report the average performance. Finally, we use the selected model to detect phishing nodes in the test set.
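To make the statistical comparison concrete, the following sketch computes a two-sided Wilcoxon signed-rank test on paired per-split scores. It is a simplified illustration: zero differences are dropped, tied absolute differences receive average ranks, and the p-value is obtained by enumerating all sign patterns, which is feasible for a small number of splits. The function name and the scores are made up for illustration and are not results from the paper.

```python
from itertools import product


def wilcoxon_signed_rank(a, b):
    """Return (statistic, exact two-sided p-value) for paired samples."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    stat = min(w_plus, w_minus)

    # Under the null hypothesis each difference is positive or negative
    # with equal probability, so enumerate all 2^n sign patterns.
    total = sum(ranks)
    hits = 0
    for signs in product((0, 1), repeat=n):
        wp = sum(r for r, s in zip(ranks, signs) if s)
        if min(wp, total - wp) <= stat:
            hits += 1
    return stat, hits / 2 ** n


# Illustrative, made-up scores over five splits.
method_a = [0.90, 0.80, 0.85, 0.95, 0.70]
method_b = [0.60, 0.70, 0.65, 0.50, 0.60]
stat, p_value = wilcoxon_signed_rank(method_a, method_b)
```

In the toy input, one method wins on every split, so the statistic is 0 and only the two extreme sign patterns are at least as extreme, giving p = 2/32.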

Experiment
In this section, we evaluate the proposed method on a real-world public Ethereum transaction dataset.

Dataset
To evaluate the performance of the proposed method, we use the commonly adopted public dataset provided by Chen et al. (2020a). Its transaction data and credible label data are provided by the authoritative platform Etherscan, and the time span ranges from 2016 to 2019. The whole transaction network contains 2,973,489 nodes, 13,551,303 edges, and 1165 labeled phishing nodes. Behavioral characteristics and patterns can only be extracted from a sufficient number of observed behaviors, and transaction behavior is no exception. Considering that at least 4 transaction records are required to calculate the skewness feature, and at least 5 transaction records are required to calculate the kurtosis, we select accounts with an in-degree of not less than 5 for research. There are 757 phishing accounts that meet this requirement, and their fraudulent amount accounts for 93.93% of the total fraudulent amount. On the other hand, we want to keep as many phishing nodes as possible to fully learn their transaction characteristics. If we increase the in-degree requirement, the number of phishing nodes that satisfy the requirement will decrease.
In Table 2, we summarize the basic statistics of these phishing accounts in detail. The amount of ether scammed by a phishing account reflects users' losses: 50% of the phishing accounts defraud more than 17.6419 ether each, with the largest scam even exceeding 42,000 ether, which further reveals the harmfulness of phishing scams. The number of transactions scammed by a phishing account refers to the number of transactions attracted by the phishing account. It can be seen that half of the phishing accounts commit fraud more than 17 times, and one phishing account even scams more than 5,000 times. The number of days that each phishing scam lasts reflects the duration of the phishing scam. More than half of the phishing scams lasted more than 6 days, and these scams account for 77.01% of the total fraudulent amount, which makes early-stage phishing detection possible.

Experimental setup
In the data processing component, the number of data splits $k$ is 12, the width of the moving time window $w$ is 15 days, and the first split timestamp $T$ is 2018-02-14. Table 3 lists the details of the training set, validation set, and test set, where $N_1$, $N_2$, and $N_3$ are the numbers of positive samples in each dataset. For the test set, $N_4$, $N_5$, and $N_6$ are the numbers of positive samples in the early, middle, and late stages.
It should be pointed out that this research has some limitations that can be improved. The width of the time window w is a hyperparameter, which represents the update cycle of the model. If w is set to a small value, few new marked phishing scams are available to retrain the model. If w is set to a large value, we can only split the dataset a few times, which is not suitable for conducting the following statistical testing. Therefore, we set w to 15 days to ensure that the model can be retrained with enough new data and the following experiments can adopt statistical testing.
Two embedding methods for phishing detection were selected as baselines: node2vec (Grover and Leskovec 2016) and trans2vec (Wu et al. 2020). The reasons for choosing these two methods for comparison are as follows. First, the transaction network has a natural graph structure, and most of the existing research is based on graph embedding algorithms. Second, node2vec is one of the classic graph embedding algorithms, which considers the weight and the walking direction at the same time. Third, trans2vec is an improved version of node2vec whose walks are guided by the amount value and timestamp, and it achieves the best performance in the phishing detection task on the Ethereum transaction network.
For embedding methods, parameters were set according to the guidance given by Wu et al. (2020). Among them, the embedding size, walks per node, walk length, and context size were set to 64, 20, 5, and 10, respectively; for node2vec, the walk parameters $p$ and $q$ were set to 0.25 and 0.75; for trans2vec, the balance parameter $\alpha$ between the amount and time was set to 0.5.
To ensure the robustness of the experimental results, we repeated the random selection process of negative nodes 10 times in the data processing part and set the significance level to 0.05 for the Wilcoxon signed-rank test to analyze the statistical significance of the results.

Feature selection
Feature selection plays a critical role in building machine learning models. As some features are highly correlated, we used the filter method to remove them. First, we computed the average correlation coefficient of each feature pair (as shown in Fig. 5). Note that the computation of the correlation coefficients only used the information of the training set. Then, for each feature pair whose correlation was higher than 0.9, we removed one of the two features. We can observe that adjacent features have a higher correlation. Finally, we chose 45 features to conduct the following experiments.
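A correlation filter of this kind can be sketched as a greedy pass over the feature columns. The details below are assumptions: the text does not say which feature of a correlated pair is removed (the sketch keeps the one that appears first), whether the absolute value of the correlation is used (it is here), or how the averaging over splits is done (the sketch uses a single training set). The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def drop_correlated(columns, threshold=0.9):
    """Greedily keep features whose |correlation| with every kept one
    does not exceed the threshold.

    columns: dict mapping feature name -> list of training-set values.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy training-set columns; the values are illustrative only.
columns = {
    "Ln1": [1, 2, 3, 4],
    "Ln3": [2, 4, 6, 8.1],   # almost a multiple of Ln1
    "In_T3": [4, 1, 3, 2],
}
selected = drop_correlated(columns)
```

Because only training-set values enter `columns`, the filter does not see validation or test information, which matches the leakage constraint stated above.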

Evaluation
We choose the following four widely adopted metrics: Accuracy, Precision, Recall, and $F_1$:
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
$$Precision = \frac{TP}{TP + FP}$$
$$Recall = \frac{TP}{TP + FN}$$
$$F_1 = \frac{2 \times Precision \times Recall}{Precision + Recall}$$
For each node, we denote its predicted label as $\hat{y}_i$; then, we regard $P(\hat{y}_i = 1 \mid y_i = 1)$ as True Positive, $P(\hat{y}_i = 0 \mid y_i = 1)$ as False Negative, $P(\hat{y}_i = 0 \mid y_i = 0)$ as True Negative, and $P(\hat{y}_i = 1 \mid y_i = 0)$ as False Positive. We use $TP$, $FN$, $TN$, and $FP$ to denote the numbers of true positives, false negatives, true negatives, and false positives, respectively.
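The four metrics follow directly from the confusion counts, as the short sketch below shows. The function name and the convention of returning 0 when a denominator is zero are assumptions.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = phishing)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0  # assumed 0 when undefined
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


metrics = classification_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

In the example there are two true positives, one false negative, two true negatives, and one false positive, so all four metrics equal 2/3.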

Performance comparison
We first compare the detection performance of different methods on the test set and then adopt the Wilcoxon signed-rank test to conduct a statistical analysis of the results. Here, we adopt the LR method as the classifier. From the results summarized in Table 4, we draw the following observations and conclusions. The LTFE method outperforms the baselines significantly, indicating the importance of local graph structure and time series in capturing user transaction characteristics. The time consumed by the LTFE method in extracting node features is much lower than that of the embedding methods, which reflects the efficiency of the LTFE method.
Furthermore, the classifier is also an important factor for detection. We compare the detection performance of Naive Bayes, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and LR on the validation set (Kotsiantis et al. 2006). The performance results are summarized in Table 5. According to the results, the LR model with C = 0.5 and solver = 'liblinear' performs statistically better than the other methods on the validation set, where C is the inverse of the regularization strength and solver is the algorithm used to optimize the problem. Therefore, we selected it as the classifier in this study.
Finally, we conduct phishing detection at different stages. Figure 6 shows the detection performance at the early stage, middle stage, and late stage. From Fig. 6, we can observe that: (1) the LTFE method significantly outperforms node2vec and trans2vec on early-stage phishing detection tasks; (2) as the stage advances, the advantage of LTFE declines slowly, but LTFE still performs better than the other methods; (3) LTFE achieves robust detection performance at all three stages, while the performance of the embedding methods fluctuates more. These phenomena show that the features extracted by the LTFE method can effectively capture user transaction features at any stage, while the embedding methods rely on sufficient transaction information to learn effective transaction features.
As time passes, new phishing scams emerge in the transaction network, and their fraudulent tactics may differ from those of past phishing scams, which creates new challenges for detection models. Fortunately, our method can quickly train a model with newly marked phishing scams to detect phishing users at the early stage. Figure 7 shows the early-stage phishing detection performance under different timestamps. The timestamp here is the moment we set to split the transaction data in the data processing component. As can be seen from Fig. 7, the LTFE method consistently achieves better results than the other methods on all metrics, which reflects the LTFE method's adaptability to new phishing scams.

Feature importance analysis
To further analyze the role of these features in early-stage phishing detection, the recursive feature elimination (RFE) method (Guyon et al. 2002) was adopted to select the 10 most important features. We set the LR model as the external estimator. First, the estimator was trained on the training set, and the coefficient of each feature was obtained; then, the least important feature was deleted from the set of features. This procedure was repeated until the desired number of features remained. After filtering, the 10 most important features were obtained. Table 6 lists the statistics of these features, where pos represents the phishing samples and neg represents the negative samples.
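The elimination loop can be sketched as follows. The logistic regression here is a plain batch gradient-descent stand-in written for illustration; it is not the regularized liblinear solver used in the experiments, and the function names, learning rate, and epoch count are assumptions. Importance is taken as the absolute value of the fitted coefficient.

```python
import math


def fit_logistic(X, y, epochs=200, lr=0.5):
    """Fit unregularized logistic regression; return the weight list."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - target
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


def rfe(X, y, n_keep):
    """Recursive feature elimination; return indices of surviving columns."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        sub = [[row[j] for j in remaining] for row in X]
        coef = fit_logistic(sub, y)
        # Drop the feature with the smallest absolute coefficient.
        weakest = min(range(len(remaining)), key=lambda k: abs(coef[k]))
        del remaining[weakest]
    return remaining


# Toy data: column 0 separates the classes, column 1 is constant,
# column 2 is unrelated to the label.
X = [[1, 0, 1], [1, 0, -1], [-1, 0, 1], [-1, 0, -1]]
y = [1, 1, 0, 0]
top = rfe(X, y, n_keep=1)
```

Refitting after every removal is what distinguishes RFE from a one-shot ranking: coefficients of the surviving features can change once a correlated feature has been dropped.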
From Table 6, we can observe that: (1) The 10 most important features are related to the local network, $ts_{\text{In-Amount}}$, $ts_{\text{In-Time}}$, and $ts_{\text{Out-Time}}$.
(2) For Ln-based features, the statistical values of Ln10 and Ln12 of the positive samples are smaller than those of the negative samples, which indicates that a phishing node has fewer transactions with each of its neighbors than a normal node; in addition, the positive samples have a much larger mean value, 50% value, and 75% value than the negative samples on Ln4, which indicates that phishing nodes have more upstream neighbors than normal nodes. (3) For $ts_{\text{In-Amount}}$ features, the statistical values of In_A8 and In_A9 of the positive samples are more uniform than those of the negative samples, which indicates that phishing nodes have a more uniform distribution of fraud amount than normal nodes.
Based on these features, regulatory authorities can develop specific rules for real-time monitoring and preliminary screening of suspicious accounts.

Conclusions
In this paper, we propose a three-component framework called EPD to address the early-stage phishing detection problem on the Ethereum transaction network. We first propose a strict time-based splitting method to divide the dataset; then, we develop a feature extraction method called LTFE to capture the characteristics of local network structure and transaction time series. We evaluated the proposed method on a real-world Ethereum transaction network and compared it with two embedding methods; the results show that our method achieves better performance and reliability on early-stage phishing detection tasks. Finally, this research should continue to advance in at least the following two aspects: (1) the dataset used in the paper is not rich enough. As time passes, Ethereum continues to produce new transaction data, and we can use richer transaction data in the future for further verification; (2) there are many other kinds of illegal and criminal behaviors in cryptocurrency transactions. This paper only focused on phishing behavior. In future work, we plan to extend the application of the proposed method to other illegal behavior detection problems.