HTSE: hierarchical time-surface model for temporal knowledge graph embedding

Representation learning based on temporal knowledge graphs (TKGs) has attracted widespread interest. Temporal knowledge graph embedding (TKGE) represents time, entity and relation tokens, all of which exhibit strong dynamics. Despite the significance of the dynamics and the persistent updates in TKGs, most studies have been devoted to static knowledge graphs. Moreover, previous temporal works have ignored the semantic hierarchies observed in knowledge modelling cases, which are common in real-world applications. Inaccurate semantic expressions caused by incomplete projections may fail to capture complex topological structures. To solve this problem, a novel hierarchical time-surface embedding (HTSE) model is proposed for the representation learning of entities, relations and time. Specifically, a unified relation-oriented hierarchical space aims to distinguish relations at different semantic levels of a hierarchy, and entities can naturally reflect the corresponding hierarchy. Then, a time surface aims to enhance the temporal characteristics, and quadruples are learned through exponential mapping and tangent planes in the time surface. In extensive experiments, HTSE achieves remarkable performance on five benchmark datasets, outperforming baseline models on time scope prediction, temporal link prediction and hierarchical relation embedding tasks. Furthermore, a qualitative analysis demonstrates the explainable strategy for hierarchical embeddings and their significance in TKGs.


Introduction
Because obtaining rich semantic information from multirelational graph-structured data has become a challenge in artificial intelligence research, multiclass knowledge graphs (KGs) have been proposed and developed for various applications. These include large KGs, such as YAGO, Wikidata, Freebase, WordNet, and DBpedia [24], which have been utilized in numerous mainstream applications, such as automatic question-answering systems, information retrieval systems, and recommendation systems. Traditional static KGs take the form of triples (s, p, o), where s represents a subject, p represents a predicate and o represents an object. For example, the triple (LeBron James, plays for, Cleveland) was true at a time node corresponding to 2018. However, some facts (entities or relationships) in a knowledge base may change over time, which can lead to conflicts between new facts and previous facts, thus making the representations of the corresponding KGs inaccurate. For example, the triple (LeBron James, plays for, Lakers) replaced the previous triple in 2022.
To accommodate such dynamics, TKGs have been proposed. In particular, for the commonly used Global Database of Events, Language, and Tone (GDELT) [16] and the International Crisis Early Warning System (ICEWS) database [19], TKGs can reflect remarkably important differences by incorporating time information into the existing graph data of the corresponding static KGs. This approach can more accurately reflect the dynamics of facts and the timeliness of real KGs. However, incompleteness in existing TKGs has yet to be addressed. Thus, related embedding tasks, such as temporal link prediction, remain challenging.
TKGE is a representation learning method in which time information, entities and relations are expressed in vector form in a time-specific space, transforming high-dimensional data into low-dimensional vectors. The principle of TKGE is shown in Figure 1. As the figure illustrates, TKGE converts the graph into quadruples, which represent the entities, relations, and time stamps in TKGs intuitively. In addition, even with the same head entities and relations, the quadruples corresponding to different times are still different. For instance, (Kevin Durant, plays for, BKN) is correct on 2023.1.3, but he currently plays for PHO. This example shows that if the temporal property is ignored, the stored knowledge cannot be updated, which results in errors and can affect the inference and recommendation applications of the knowledge graph. It therefore reflects the importance of temporal information for the correctness of knowledge graph links.
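The role of the time component in a quadruple can be illustrated with a minimal Python sketch. The `Quadruple` type, the `tails_at` helper and the single-year validity ranges are illustrative assumptions built from the LeBron James example in the introduction; they are not part of HTSE.

```python
from typing import List, NamedTuple


class Quadruple(NamedTuple):
    """A time-scoped fact (head, relation, tail, [start, end])."""
    head: str
    relation: str
    tail: str
    start: int  # first year in which the fact is taken to hold (illustrative)
    end: int    # last year in which the fact is taken to hold (illustrative)


# Facts taken from the introduction; the single-year ranges are assumptions.
FACTS = [
    Quadruple("LeBron James", "plays for", "Cleveland", 2018, 2018),
    Quadruple("LeBron James", "plays for", "Lakers", 2022, 2022),
]


def tails_at(facts: List[Quadruple], head: str, relation: str, year: int) -> List[str]:
    """Return the tail entities that are valid for (head, relation) in the given year."""
    return [
        f.tail
        for f in facts
        if f.head == head and f.relation == relation and f.start <= year <= f.end
    ]
```

With the same head and relation, the answer depends on the queried time, which is exactly the information a static triple store cannot express.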
Several existing mainstream TKGE models, such as HyTE [9], TTransE [15], ChronoR [24] and GIE [3], ignore the algebraic representations of curvature vectors and inherent topological information. This results in the inaccurate projection of vectors in scenarios with rich interactions between temporal properties and multirelational features in TKGs. A surface model is suitable for this scenario because it can embed high-dimensional curvature vectors. Therefore, it is necessary to explore a time surface model for TKGE. This is one of the major motivations for this work. Another major motivation is the lack of semantic hierarchy modelling in existing models. A semantic hierarchy is an indispensable property of temporal knowledge graphs. For instance, the quadruple (Rio de Janeiro, capital of, Brazil, 1764-1956) is true before the relocation of the capital to Brasilia, where Brazil is at a higher level than Rio de Janeiro. In Figure 2, the quadruple (Obama, born in, Honolulu, 1961-08-04) presents a person born in a city, where the city is at a higher level than the person. More specifically, the ICEWS dataset contains the quadruples (Obama, demand, Korea, [2014-04-25]) and (USA, criticize or denounce, Korea, [2014-04-28]), where the person 'Obama' is at a lower level of the hierarchy than the countries 'USA' and 'Korea'. If a quadruple (Obama, criticize or denounce, Korea, [2014-04- ) is inferred, it is obviously contrary to the facts. This is because the different hierarchy levels of Obama and USA are ignored, resulting in incorrect quadruple inference. This case shows that unclear hierarchy modelling leads to the wrong inference of entities in link prediction. Furthermore, it can cause fundamental mistakes in representation learning and affect the quality of knowledge graphs. Although some works have focused on the issue of semantic hierarchies, they have been limited to static embedding and have ignored temporal scenarios. This can lead to an inability to model complex relations and to incomplete semantic expressions. Therefore, it is still challenging to find a strategy that represents the semantic hierarchy accurately and effectively in TKGs. It is thus crucial to explore a time-surface model that can process the entities and relations at each hierarchy level of a temporal knowledge graph.
To overcome the above issues and fill the research gaps, this work proposes a novel hierarchical time-surface embedding (HTSE) model. First, the given entities and relations are embedded in a relation-oriented hierarchical space, which clearly reflects the different importance levels among the multihierarchical entities connected by each relation. Then, the resulting quadruples are projected onto a time surface and used to represent the rich interactions via exponential mapping, which improves the semantic expression ability of HTSE. Our experiments demonstrate the superiority of the prediction process and the validity of HTSE hierarchy modelling. We summarize the significant contributions of this work below.
• To address the issue of ignoring hierarchical knowledge when modelling, a relation-oriented hierarchical modelling strategy is proposed to capture semantic hierarchies more completely with regard to the entities and relations in TKGs. First, three separate imaginary components α, β, and γ are used in the embeddings of the entities and relations within the hierarchical space. Then, relations at different semantic levels of the hierarchy can be distinguished, and entities can naturally reflect the corresponding hierarchy. The processed entities and relations are taken as the quadruple bases following the introduction of the time attribute.
• An HTSE model is proposed to address the shortcomings of semantic expression and to capture the deep topology information resulting from uneven quadruple projections. Different from existing geometry-based methods, the proposed approach first transforms the semantic hierarchy space into a time surface, on which the various elements of a quadruple can be expressed. Next, the time surface is divided into several local manifolds by timestamps, and the quadruples are embedded accurately and more intuitively via an exponential mapping approach. In other words, the surface embedding model can improve the accuracy of representation learning with respect to semantic hierarchies and time dynamic characteristics.
• Extensive experiments are conducted on temporal link prediction and time information prediction. The experimental results show that the HTSE model can improve the accuracy of embeddings and the capacity for knowledge completion over that of the baseline models. The advantages of the proposed hierarchical knowledge modelling strategy are illustrated in an analysis of hierarchical relation embeddings. Furthermore, the explainable strategy for link prediction on different hierarchies is shown in the qualitative analysis.
Our proposed HTSE model is a type of dynamic geometric model. In particular, HTSE shares similarities with DyERNIE in studying dynamic relation embeddings based on Riemannian manifolds. However, there are two major differences:

The proposed HTSE
To address the issues of strong link association, temporal properties, semantic hierarchy and projection integrity that existing models ignore, a novel HTSE model is proposed and analysed in this section. We emphasize the process of semantic hierarchy modelling and the derivation of a time-surface embedding space. Specifically, HTSE contains four parts: (1) To address weak link association and unclear semantic hierarchy, a relation-specific and hierarchy-aware space is presented in the first subsection; it enhances the correlation between entity pairs and models hierarchies clearly. (2) To tackle the lack of temporal information modelling and incomplete projection in semantic space, a time surface and exponential mapping module is presented; it models high-dimensional temporal entities and relations to enhance temporality. (3) The score function, described in its own subsection, supports the entity and relation embedding calculations. (4) The final subsection, on training, presents an established training method in preparation for the experiments.
A list of specific notations and descriptions is provided in Table 1. The four parts build on one another progressively.

Relation-specific and hierarchy-aware space
A TKG possesses dynamic characteristics among its entities and relations over time. HTSE transfers entities and relations to a relation-oriented space by integrating the embeddings of entities and relations into a hierarchy-aware space. Given a graph containing sets of entities $E$ and relations $R$, we denote entity pairs as $h, t \in E$. Then, we assume that the relation vectors at the $i$th hierarchy level are $r_i = [r_{i1}, r_{i2}, \ldots, r_{im}] \in R$. As illustrated in Figure 3, entities and relations are represented by different coloured circles and directed arrows, respectively, on an equipotential surface expressed by dashed lines. The three axes form the corresponding semantic space. In the hierarchical graph $G_r$, the green/blue points at the same level represent the head/tail entity vectors $h_i$ and $t_i$, respectively. First, we introduce a principle for obtaining a subgraph from $G_r$ at each hierarchy level. Specifically, we partition the hierarchical graph $G_r$ into different subgraphs according to the different knowledge hierarchies. For instance, 'USA' and 'Korea' share the same subgraph because they are both country-level entities. In contrast, the entity 'Obama' is a person and obviously cannot belong to the same level as a country. Therefore, 'Obama' is assigned to another subgraph. All delineated subgraphs can be combined as $G_r$, where $n$ is the number of hierarchy levels.
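The subgraph partition described above can be sketched as follows. The source does not state how hierarchy levels are assigned or how a triple whose head and tail lie on different levels is placed, so the `LEVEL_OF` labels and the rule of filing each triple under the level of its head entity are assumptions made only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical level labels; the source does not say how levels are obtained.
LEVEL_OF: Dict[str, str] = {"Obama": "person", "USA": "country", "Korea": "country"}

TRIPLES: List[Triple] = [
    ("Obama", "demand", "Korea"),
    ("USA", "criticize or denounce", "Korea"),
]


def entities_by_level(level_of: Dict[str, str]) -> Dict[str, Set[str]]:
    """Group entities so that entities on the same level share one subgraph."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for entity, level in level_of.items():
        groups[level].add(entity)
    return dict(groups)


def subgraphs_by_level(triples: List[Triple], level_of: Dict[str, str]) -> Dict[str, List[Triple]]:
    """File each triple under the level of its head entity (an assumption)."""
    subgraphs: Dict[str, List[Triple]] = defaultdict(list)
    for h, r, t in triples:
        subgraphs[level_of[h]].append((h, r, t))
    return dict(subgraphs)
```

Here 'USA' and 'Korea' fall into the same group while 'Obama' is placed in a separate one, mirroring the example in the text.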
Then, given a subgraph $g_r = (h, r, t)$ at a hierarchy level, HTSE represents the embeddings of entities within the hierarchical space. Inspired by QuatER [20], the embeddings of the head and tail entities are represented as $h_r$ and $t_r \in \mathbb{R}^n$, where $h$ and $t$ form a new hierarchical entity pair and the hierarchical relations $r \in R$ link these entities to each other. $h_{\alpha,h}$, $h_{\beta,h}$ and $h_{\gamma,h} \in \mathbb{R}^n$ are the elements of the entity decomposition in semantic space, corresponding to the three imaginary components $\alpha$, $\beta$ and $\gamma$ used in the embeddings within the hierarchical space.

Time surface and exponential mapping
The idea of the model is first to integrate the entities in the knowledge graph into a unified relation-specific and hierarchical space (as in Section 2.1) and then to attach the embedding of temporal information. This subsection therefore aims to complete the representation learning of temporal information and make our model applicable to most TKGs. Figure 4 shows an example of the HTSE embedding process for a quadruple. We set the input as a quadruple instance. The instance is processed into a vector representation on the manifold space by entity pair embedding, relation enhancement and hierarchy modelling. In addition, temporal embedding plays a significant role in forming the vector representation.
Structured knowledge should be valid only within a specific time range, and ignoring this temporal information can cause the dynamics to be lost from the expression of facts. Therefore, it is worth proposing a new time-surface model for entity and relation reasoning and time period prediction. This section introduces a time-surface-aware model as an alternative to existing methods. Specifically, the space processed as discussed in Section 2.1 is defined as $G_r$, and a time axis passing through $G_r$ is transformed to incorporate temporal information. The transformed time surface is set as a combination of several local manifolds $M$ and tangent planes $T_pM$, representing the correlations between entities and relations on the surface (as depicted in Figure 3). Moreover, $\exp_p(v)$ represents the connection in the time surface between two quadruple time properties. It is an exponential mapping from the starting to the ending time stamp, and the temporal relation modelling is represented by the solid red line. The red line $l_{p,v}$ represents the temporal relation after projection onto the time surface.
A $k$-dimensional manifold $M$ is a real and smooth manifold with Euclidean structure. Each quadruple can be expressed on a corresponding manifold as $(E, R, \tau) \in M$. For each point $p \in M$ and each tangent vector $v \in T_pM$ for which the geodesic $l_{p,v}$ exists, $\exp_p$ is an exponential mapping. The details of the exponential mapping are as follows. Since the geodesic $l_{p,v}$ is defined locally, $\exp_p(v)$ can only be defined on an open subset of $T_pM$, for example, the set of tangent vectors whose length is less than $\delta$, where $\delta$ is a constant associated with point $p$. Geometrically, $\exp_p(v)$ is the point reached by starting from $p$ and following the geodesic with initial tangent vector $v$ of length $\|v\|$. From the properties of the geodesic $l_{p,v}(\tau)$, we know that the norm of its derivative with respect to $\tau$ is constant. Then, with $\exp_p(v) = l_{p,v}(1)$ and $p = l_{p,v}(0)$, the arc length from point $p$ along the geodesic to $\exp_p(v)$ is exactly equal to $\|v\|$, as depicted by the red curve in Figure 3. This proves the feasibility of exponential mapping.
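The equations of this derivation did not survive in the text above. For reference, the standard Riemannian-geometry statements that match the surrounding description can be written as follows. This is a reconstruction under that assumption and may differ in notation from the authors' own formulas; the ball $B_\delta(0)$ and the covariant derivative $\nabla$ are notation introduced here.

```latex
% Geodesic through p with initial velocity v, and the exponential map
l_{p,v}(0) = p, \qquad \dot{l}_{p,v}(0) = v, \qquad
\exp_p(v) = l_{p,v}(1), \quad v \in B_\delta(0) = \{\, v \in T_pM : \|v\| < \delta \,\}

% A geodesic has constant speed, because \nabla_{\dot{l}} \dot{l} = 0
\frac{d}{d\tau} \big\| \dot{l}_{p,v}(\tau) \big\|^2
  = 2 \big\langle \nabla_{\dot{l}_{p,v}} \dot{l}_{p,v},\ \dot{l}_{p,v} \big\rangle = 0
\;\Longrightarrow\;
\big\| \dot{l}_{p,v}(\tau) \big\| = \|v\|

% Arc length of the geodesic segment from p to \exp_p(v)
L = \int_0^1 \big\| \dot{l}_{p,v}(\tau) \big\| \, d\tau = \|v\|
```

The last line is the statement used in the text: the arc length from $p$ to $\exp_p(v)$ along the geodesic equals $\|v\|$.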
By analysing Figure 3 and combining it with the above derivation procedure, the temporal relation $r_\tau$ in the time interval $[\tau_i, \tau_i + \tau]$ can be defined in terms of the following quantities: $\tau_i$ is the starting time stamp in the local manifold, and $r_0 \in M$ is the initial relation vector, indicating the initial embedding without time fluctuations. $v_0 \in T_pM$ represents the temporal relation-specific curvature vector, which is defined in a tangent space that captures the dynamic features of the relations and entities over a time transformation. The relations are thus derived from the initial embeddings in combination with a tangent vector, so that the stationary semantics of relations/entities are easily represented while special attention is given to the additional temporal information. The final three-component element is represented as $(h_r, r_\tau, t_r)$, and a time component can be added to form a new quadruple $(h_r, r_\tau, t_r, [\tau_1 : \tau_2])$.
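To make the exponential mapping concrete, the following Python sketch uses the unit sphere as a stand-in manifold, because its exponential map has a simple closed form. The source does not specify the geometry of the time surface, and its formula for $r_\tau$ is not recoverable here, so both the sphere and the rule `relation_at` (moving $r_0$ along the geodesic with velocity $v_0$ for an elapsed time `dt`) are assumptions for illustration only.

```python
import math
from typing import List, Sequence


def norm(x: Sequence[float]) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in x))


def exp_map_sphere(p: Sequence[float], v: Sequence[float]) -> List[float]:
    """Exponential map on the unit sphere.

    p must be a unit vector and v a tangent vector at p (p . v = 0).
    The result lies on the sphere at geodesic distance ||v|| from p.
    """
    n = norm(v)
    if n == 0.0:
        return list(p)
    return [math.cos(n) * pi + math.sin(n) * vi / n for pi, vi in zip(p, v)]


def geodesic_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Great-circle distance between two unit vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))


def relation_at(r0: Sequence[float], v0: Sequence[float], dt: float) -> List[float]:
    """Hypothetical time-dependent relation: follow the geodesic from r0
    with tangent velocity v0 for an elapsed time dt."""
    return exp_map_sphere(r0, [dt * x for x in v0])
```

For `r0 = [1, 0, 0]` and `v0 = [0, 0.5, 0]`, `relation_at(r0, v0, 2.0)` stays on the sphere, and its geodesic distance from `r0` equals the length of the scaled tangent vector, which is the arc-length property stated in the derivation.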

Score function
To develop an adaptive measurement to support the score function, we adopt the standardized Euclidean distance.The standardized Euclidean distance measurement method has been demonstrated to be the most accurate method for calculating the distance between vectors.
This method can overcome the inaccuracy of traditional methods caused by the uneven distribution of data in each dimension, which can lead to low link prediction accuracy. For this purpose, each component is first standardized to an equal mean and variance to balance the dimensional components. We set the standard deviation of the weighted vector for entities/relations to $S_k$, which defines the score function of HTSE(control). To model these relations better, we incorporate the idea of error control to extend HTSE. The term $\|r_\tau - r\|_2$ ensures that the time-specific relation vector $r_\tau$ is the best nearest neighbour when calculating the distance from the original vector $r_0$, and the hyperparameter $\mu$ controls the strength of this constraint.
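A minimal sketch of a score of this kind is given below. The standardized Euclidean distance itself is standard; however, the exact HTSE score function is not recoverable from the text, so the translational form $h + r_\tau \approx t$ and the additive penalty $\mu \|r_\tau - r_0\|_2$ are assumptions, and the function names are hypothetical.

```python
import math
from typing import Sequence


def standardized_euclidean(x: Sequence[float], y: Sequence[float], s: Sequence[float]) -> float:
    """Euclidean distance after dividing each dimension k by its standard deviation s[k]."""
    return math.sqrt(sum(((a - b) / sk) ** 2 for a, b, sk in zip(x, y, s)))


def score(
    h: Sequence[float],
    r_tau: Sequence[float],
    t: Sequence[float],
    r0: Sequence[float],
    s: Sequence[float],
    mu: float = 0.0,
) -> float:
    """Assumed score: distance between h + r_tau and t, plus an error-control
    penalty mu * ||r_tau - r0||_2. Lower values mean a more plausible quadruple."""
    translated = [a + b for a, b in zip(h, r_tau)]
    distance = standardized_euclidean(translated, t, s)
    drift = math.sqrt(sum((a - b) ** 2 for a, b in zip(r_tau, r0)))
    return distance + mu * drift
```

Dividing by the per-dimension standard deviation keeps a dimension with a large spread from dominating the distance, which is the imbalance the paragraph above describes. Setting `mu = 0` switches the error-control term off.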

Training
We first construct the set of negative samples. A time-aware negative sampling method is proposed to emphasize temporal information by considering all quadruples in TKGs. Similar to previous works, a margin-based ranking loss is minimized, where $Q^+_\tau$ are valid quadruples and $Q^-_\tau$ are negative samples, both of which contain time information.
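The loss formula itself is not reproduced in the text above. The sketch below shows a common margin-based ranking loss together with a simple time-aware corruption routine, assuming distance-style scores (lower is more plausible). The helper names, the 50/50 choice between head and tail corruption, and the rule of keeping the time stamp fixed are assumptions, not the authors' stated procedure.

```python
import random
from typing import Hashable, List, Sequence, Set, Tuple

Quad = Tuple[Hashable, Hashable, Hashable, Hashable]


def corrupt(quad: Quad, entities: Sequence[Hashable], known: Set[Quad], rng: random.Random) -> Quad:
    """Replace the head or the tail with a random entity while keeping the
    relation and the time stamp, rejecting candidates that are known facts.
    Assumes at least one valid corruption exists, otherwise it never returns."""
    h, r, t, tau = quad
    while True:
        e = rng.choice(entities)
        candidate = (e, r, t, tau) if rng.random() < 0.5 else (h, r, e, tau)
        if candidate not in known:
            return candidate


def margin_ranking_loss(pos_scores: List[float], neg_scores: List[float], margin: float) -> float:
    """Sum of max(0, margin + f(q+) - f(q-)) over paired positive/negative scores."""
    return sum(max(0.0, margin + p - n) for p, n in zip(pos_scores, neg_scores))
```

A pair contributes zero loss once the negative sample scores at least `margin` worse than the positive one, so training concentrates on negatives that are still hard to separate.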
Since the model possesses a geometric structure, we make use of Riemannian stochastic gradient descent (RSGD) [11], in which the Riemannian gradient $\nabla^R L$ is obtained by rescaling the Euclidean gradient $\nabla^E L$ by the inverse of the metric tensor of the time surface.
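The rescaling step can be illustrated on a manifold whose metric tensor is known in closed form. The source does not give the metric of the time surface, so the sketch below substitutes the Poincaré ball, whose metric is $(2 / (1 - \|x\|^2))^2$ times the identity; the update uses a first-order retraction instead of the exact exponential map. Both choices are assumptions for illustration.

```python
from typing import List, Sequence


def riemannian_gradient(x: Sequence[float], euclid_grad: Sequence[float]) -> List[float]:
    """Rescale the Euclidean gradient by the inverse metric tensor.

    Stand-in geometry: Poincare ball, where the inverse metric at x is
    ((1 - ||x||^2)^2 / 4) times the identity matrix.
    """
    sq_norm = sum(a * a for a in x)
    scale = (1.0 - sq_norm) ** 2 / 4.0
    return [scale * g for g in euclid_grad]


def rsgd_step(x: Sequence[float], euclid_grad: Sequence[float], lr: float) -> List[float]:
    """One RSGD update with a first-order retraction: x - lr * Riemannian gradient.
    For large lr the result may leave the ball; a real implementation would
    project back or use the exact exponential map."""
    rgrad = riemannian_gradient(x, euclid_grad)
    return [a - lr * g for a, g in zip(x, rgrad)]
```

Near the boundary of the ball the scale factor shrinks towards zero, so the same Euclidean gradient produces a much smaller step there than at the origin.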
The characteristic HTSE steps are shown in Algorithm 1. The triples of the quadruples in the datasets are introduced as inputs $(h, r, t)$. Then, we process the relations and entities separately by calculating entropy values $e_j$ to model the unified relation-specific and entity-weighted spaces. In addition, exponential mapping is applied on the time surface $T$ by projecting the time-aware space, which provides accurate calculations. Finally, the projected vectors and time information on the surface are obtained. The weighting process and the emphasis on time attributes are reflected throughout the procedure, as shown by (1) and (3).

Algorithm 1 HTSE time-surface estimation.
Input: Entity pairs $h$ and $t$, relations $r$, transformed vectors $v_{h,r,\cdot}$ and $v_{t,r,\cdot}$; three hierarchical space vector decomposition units $\alpha$, $\beta$ and $\gamma$; initial relation $r_0$, initial curvature vector $v_0$, local manifold $M$, time surface $T$ with segments $K$: $K_1, K_2, \ldots, K_m$, tangent plane $T_pM$
Output: Projected entity and relation vectors on the time surface: $h$, $t$ and $r_\tau$
1: for i = 1 : k do
2: for j = 1 : m do
3: end for

Nontemporal embedding models: A number of mature static knowledge graph embedding methods have recently become available. To address the issue of semantic loss and improve the accuracy of complex relation embeddings, various embedding space models have been proposed, such as the TransE [2], TransH [28], TransO [17], and TransR [18] models. These translation-based methods follow the principle of the closest distance between entity and relation vectors and have achieved success. To solve the issues of algebraic ill-posedness and the insufficient adaptivity of geometric forms, different types of geometry-based models have been proposed and have achieved excellent performance on link prediction tasks. ManifoldE [29] reviews knowledge representations in a manifold space with regard to topology. Topology-aware associations between relations are also effectively exploited in TACT [5]. Furthermore, to handle symmetric, antisymmetric and inverse relations using linear functions, RotatE [25] and LineaRE [23] proposed rotation/linear ideas to embed the relations of entity pairs into the corresponding space. Static hierarchical embedding models, such as HAKE [4,31], HittER [6] and HBE [21], apply different hyperbolic reflection operations to model multiple hierarchical relation patterns and achieve better results. In addition, [7] and [32] enhance link prediction by means of interpretable embeddings and counterfactual links in knowledge graphs.
Temporal embedding models: Recent studies on TKGE models, including extended models and entity dynamics models, have sought to enhance the performance of temporal prediction [13]. To make them more influential and extensible, these methods are in fact extensions of previous static KG models. [15] and [1] focus on introducing relational embedding variables extended to quadruples to upgrade a static model to a model with a time attribute, e.g., TTransE, TA-DistMult and DE-SimplE [10]. RE-NET [14] models a sequence of events through a recurrent neural network event encoder and an adjacent aggregator. HyTE [9] is based on a hyperplane representation of the time space and expands the integration of the temporal information and element representations into TKGs. DyERNIE [11] introduces dynamic evolution in the form of a Riemannian manifold to capture the dynamic characteristics of TKGs using velocity vectors. TIMEPLEX [12] is an improved variant of ComplEx [27] that automatically utilizes the recursive properties of relations and temporal interactions. Know-Evolve [26] and EvoKG [22] use a deep evolution network structure based on a knowledge system for temporal reasoning. Inspired by human cognition, [8] proposes a subgraph-based model for answering complex questions over temporal knowledge graphs, which guides the idea of subgraph division in this paper. ChronoR [24] captures the rich interaction between temporal knowledge graphs and multirelation features with a high-dimensional rotation as its transformation operator.
Unlike the previous methods, the novel hierarchical time-surface embedding model HTSE is introduced to address the issue of temporal semantic expression and to capture the deep topology information resulting from uneven quadruple projections. It also focuses more on modelling hierarchical entities and relations accurately in knowledge graphs. In short, HTSE better integrates temporal information augmentation and semantic hierarchy modelling to accommodate most temporal knowledge graphs. In contrast, other successful existing models are not designed to represent semantic hierarchies and are limited to nonsurface TKG projections. Therefore, HTSE is proposed and analysed comprehensively in this paper.
• Evaluating our model and comparing it with static and dynamic models in terms of temporal link prediction.
• Illustrating the advantage of HTSE based on surface projection with respect to temporal scope prediction.
• Analysing the differences between the results of the state-of-the-art (SOTA) models and HTSE regarding hierarchical relation embedding.
• Presenting queries based on fact datasets to demonstrate the strategy validity of our model for hierarchical embeddings.

Fundamental setup
The abovementioned datasets contain facts associated with time annotations. The dataset statistics are summarized in Table 2. The hyperparameters and optimization procedure are presented in the experimental implementation details.

Baselines and evaluation protocol
To provide an overall presentation of the superiority of HTSE, we select several excellent representation learning models with static and temporal properties as our baselines. Specifically, we use TransE [2] and RotatE [25] as representative static models. As representative TKGE models, we choose HyTE [9], DyERNIE [11], TIMEPLEX [12] and ChronoR [24]. We adopt the standard MRR and Hits@n (n = 1, 3, 10) metrics to evaluate the link prediction performance.
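For each quadruple $q = (h, r_\tau, t, \tau)$ in the test set $Q_\tau$, the MRR is computed as $\mathrm{MRR} = \frac{1}{|Q_\tau|} \sum_{i=1}^{|Q_\tau|} \frac{1}{\mathrm{rank}_i}$, where $\mathrm{rank}_i$ denotes the ranking of the first correct answer for the $i$th quadruple in $Q_\tau$, and Hits@n is defined in [2].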
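As a small worked example, the two metrics can be computed from a list of ranks as follows. The definitions are the standard ones (mean reciprocal rank, and the fraction of test quadruples whose correct answer is ranked within the top n); the function names and the example ranks are illustrative.

```python
from typing import Sequence


def mrr(ranks: Sequence[int]) -> float:
    """Mean reciprocal rank over the ranks of the first correct answers."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at(ranks: Sequence[int], n: int) -> float:
    """Fraction of test quadruples whose correct answer is ranked in the top n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)
```

For the ranks `[1, 2, 4, 10]`, the MRR is (1 + 0.5 + 0.25 + 0.1) / 4 = 0.4625, while Hits@1, Hits@3 and Hits@10 are 0.25, 0.5 and 1.0, respectively.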
Furthermore, similar to the operations executed under the 'raw' and 'filtered' settings in TransE [2] and inspired by [12], we report a filtered version of Hits@3. Specifically, we replace the head/tail entities with other entities when testing quadruples during evaluation. Some of the resulting corrupted quadruples may themselves be correct facts. The 'raw' and 'filtered' settings differ in how the test set treats them: under the 'filtered' setting, corrupted quadruples that are actually correct are filtered out before ranking.
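The difference between the two settings can be sketched as a ranking routine. The sketch assumes distance-style scores (lower is better) and strict comparison for ties; the function name `rank_of` and the candidate scores in the usage note are hypothetical.

```python
from typing import Dict, FrozenSet, Hashable


def rank_of(
    target: Hashable,
    scores: Dict[Hashable, float],
    known_true: FrozenSet[Hashable] = frozenset(),
    filtered: bool = True,
) -> int:
    """Rank of the target candidate among all scored candidates.

    Under the 'filtered' setting, other candidates that also form known
    correct quadruples are skipped so they cannot push the target down.
    """
    target_score = scores[target]
    rank = 1
    for candidate, s in scores.items():
        if candidate == target:
            continue
        if filtered and candidate in known_true:
            continue
        if s < target_score:
            rank += 1
    return rank
```

With hypothetical scores `{"Lakers": 0.2, "Cleveland": 0.1, "Miami": 0.5}` and 'Cleveland' known to be another correct answer, the target 'Lakers' has raw rank 2 but filtered rank 1.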
Experimental implementation details. To present an impartial comparison between HTSE and the baselines, we utilize the experimental setup for the classic HyTE and DyERNIE baselines as our experimental basis and select the optimal parameters for each model. Then, we use RSGD to train the baselines and optimize the hyperparameter setups in accordance with the MRRs obtained on the validation set. We set the maximum number of epochs to 5000 and fix the minibatch size to 1024. The remaining settings are searched over the following values: the embedding dimension $d \in \{100, 200, 300, 500, 1000\}$, the learning rate $l_\gamma \in \{0.001, 0.005, 0.01, 0.05, 0.1\}$, $\eta \in \{1, 3, 6, 12\}$, the margin $\gamma \in \{3, 6, 12, 36, 48, 120\}$, and the error control parameter $\mu$ in the range $[0, 1]$.
Time complexity and space complexity analysis. We next compare space and time complexity to illustrate the higher efficiency of HTSE. A higher complexity indicates that the efficiency may be lower. Details are shown in Table 3.
Table 3 compares the space and time complexity of different models based on the assumption that $d_1, d_2 \ll n$. Here, $p$ and $q$ denote the numbers of entities and relations, respectively, and $d_1$ and $d_2$ are the entity-specific and relation-specific dimensionalities of the embedding space, respectively (frequently $d_1 = d_2$). The above training models provide a foundation for the open world assumption. As shown in Table 3, the lower space and time complexity of HTSE mean that its efficiency is higher than that of the other baselines. Having compared the complexity, we now discuss the performance on current mainstream tasks.

Temporal link prediction
In the evaluation, we aim to illustrate the advantages of our model with respect to hierarchy and time-surface modelling in comparison with other models. TransE [2] and RotatE [25] are chosen for comparison due to their common embedding space and lack of consideration for temporal dynamics and hierarchies. HyTE [9], TIMEPLEX [12], DyERNIE [11], and ChronoR [24] are chosen due to their use of hyperplanes, Euclidean distances, complex spaces and Riemannian manifolds instead of surface models.
The main task is to predict the missing entity of an incomplete quadruple. Unlike previous works involving static KGs, this task predicts missing entities for quadruples in TKGs. More formally, for negative samples derived from the gold-standard quadruple $(h, r_\tau, t, \tau)$, we perform prediction on two categories: $(h, r_\tau, ?, \tau)$ and $(?, r_\tau, t, \tau)$. Following the same filtered settings as DyERNIE [11], we evaluate our model with the MRR and Hits@1, 3, 10 metrics described in the evaluation protocol in Section 4.1. The results obtained with the above experimental settings are presented in Table 4.
Results and observations. As shown in Table 4, on the YAGO11k and Wikidata12k datasets, HTSE produces excellent link prediction results compared with those of the promising HyTE, TIMEPLEX and DyERNIE models as well as the traditional static models. This is because it uses a projected time surface instead of a traditional space, which allows high-dimensional data to be captured more accurately. Moreover, the experimental results show improvements greater than 4% in terms of both the MRR and Hits@10 metrics. These findings suggest that the hierarchy modelling in our model is more adaptable than other methods for link prediction. (In Table 4, bold entries represent the best performance.)
On the GDELT dataset in Table 4, HTSE also achieves somewhat better performance except in terms of Hits@1, as this metric may be affected by the time stamps of the validation facts. According to the other results, the temporal models are superior to the static models, thus demonstrating the significance of capturing time information. On the ICEWS14 and ICEWS05-15 datasets, the results of HTSE are also superior on average to those of the existing excellent models due to the more accurate expressivity of our model. Overall, HTSE outperforms the other models in terms of link prediction due to its advantages of hierarchy modelling and more accurate projections that preserve temporal interactions. In addition, error control is beneficial for improving the results because it ensures that the time-specific relation vector $r_\tau$ is the best nearest neighbour when calculating the distance from the original vector $r$.
Link prediction performance over time. To illustrate the performance of HTSE in future temporal link prediction over time clearly and comprehensively, we take the time information as an index and present the corresponding comparisons on the ICEWS, GDELT, Wikidata and YAGO datasets, as depicted in the line chart in Figure 5. Specifically, the vertical axis shows the filtered Hits@3 defined in the evaluation protocol in Section 4.1, and the horizontal axis shows different time stamps (day, month and year). Here, we present the performance of HTSE under the 'filtered' setting to show that it exhibits superior temporal link prediction capabilities over time after the removal of corrupted validation or test triples. Furthermore, we choose HyTE [9], DyERNIE [11], ChronoR [24] and TIMEPLEX [12] for comparison because they have weaker projection abilities and ignore hierarchies. Figure 5 shows that the performance of these models fluctuates at different time stamps. We notice that HTSE, corresponding to the red line, consistently outperforms the other models over different time intervals, although its performance also varies irregularly over time. This finding suggests that hierarchy modelling and surface projection achieve enhanced link prediction performance over time compared with the other models. Inspired by these results, we consider that future performance can be predicted by simulating and fitting numerical curves that follow an existing time sequence. This is beneficial for addressing the issue of predicting future time periods.

Temporal scope prediction results
In this section, we focus on illustrating the advantage of HTSE based on surface projection for temporal scope prediction. Inspired by [12], the PG score function is proposed to evaluate the accuracy of time stamp completion and the temporal information prediction results obtained in temporal knowledge graph embeddings. We test the results on the ICEWS14, ICEWS05-15 and GDELT datasets. HyTE [9], TIMEPLEX [12], DyERNIE [11], and ChronoR [24] are chosen for comparison because these models have certain abilities to predict temporal scopes, and none of them focus on hierarchical knowledge representations and surface projections.
Considering the scarcity and completeness of facts in TKGs, the ability to predict time information is indispensable.We wish to predict the time instance and time interval that target a given test quadruple (h, r τ , t, ?).According to the established time surface, the relations and entities are projected onto this surface to check the plausibility of the test triple.
To correctly predict the temporal interval, we should make use of the optimal nearest fact.Specifically, for (h, r τ , t, ?), the gold-standard time interval is T gold = [t s g , t e g ] (which consists of the starting and ending time stamps of a true fact), and this interval should be compared to the predicted interval T pr e = [t s p , t e p ] to determine the similarity of the prediction to the true fact.
In terms of the chosen evaluation metrics, the metric in TKBC [12] is not entirely applicable to this task because it is designed to address the problem of large differences in the time proportion for the TAC metric [12]. For instance, consider two groups of intervals: the gold interval [2014, 2017] compared with the predicted interval [2010, 2013], and the gold interval [7, 10] compared with the predicted interval [11, 14]. Although the two groups share the same TAC score, the time proportion of the former is obviously smaller than that of the latter. That is, a 4-year difference is usually considered more serious around year 10 than around year 2010. However, the time proportion in the datasets cannot be large for this task. In response, we set the PG score function as a metric, which improves on the TAC metric with a two-parameter calculation to enhance the accuracy of the scores.
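The two interval groups above can be checked numerically with a short sketch. It assumes that TAC takes the form commonly given in the temporal knowledge base completion literature, $TAC = \frac{1}{2}\left[\frac{1}{1+|t^s_g - t^s_p|} + \frac{1}{1+|t^e_g - t^e_p|}\right]$; this form is an assumption here and may differ in detail from the expression intended in this paper. The helper `start_offset_ratio` is a hypothetical stand-in for the "time proportion" and is not a metric from the paper.

```python
def tac_score(gold, pred):
    """TAC between a gold and a predicted interval, each given as (start, end).

    Assumed form: mean of 1 / (1 + |difference|) over start and end points.
    """
    gold_start, gold_end = gold
    pred_start, pred_end = pred
    start_term = 1.0 / (1.0 + abs(gold_start - pred_start))
    end_term = 1.0 / (1.0 + abs(gold_end - pred_end))
    return 0.5 * (start_term + end_term)


def start_offset_ratio(gold, pred):
    """Hypothetical 'time proportion': start offset relative to the gold start."""
    return abs(gold[0] - pred[0]) / gold[0]


group_a = ((2014, 2017), (2010, 2013))  # gold, predicted
group_b = ((7, 10), (11, 14))           # gold, predicted

tac_a = tac_score(*group_a)
tac_b = tac_score(*group_b)
ratio_a = start_offset_ratio(*group_a)
ratio_b = start_offset_ratio(*group_b)
```

Under this assumed form both groups receive the same TAC value, since each has a 4-unit offset at both ends, while the relative offset differs by more than two orders of magnitude. This is the situation that motivates a metric other than plain TAC.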
Specifically, the difference PG between two time intervals/scopes can be calculated as:
The score function PG serves as a criterion for evaluating the accuracy of time predictions. As $t^s_p - t^s_g \to 0$ and $t^e_p - t^e_g \to 0$, PG approaches its maximum value, i.e., $PG \to \sqrt{2}$. Thus, the higher the PG score (which is subject to the condition $0 < PG < \sqrt{2}$), the closer the predicted time interval is to the gold-standard value for a given fact.
The results of TTransE, TIMEPLEX, DyERNIE, HyTE and HTSE are compared in Table 5. Note that the scores are converted to percentages of the maximum score, and bold entries represent the best performance. As indicated in this table, the results of HTSE, TIMEPLEX and DyERNIE are vastly superior to those of HyTE and TTransE. These findings show that a more adaptive projection space can result in better performance in terms of PG. Moreover, the results of 71.3% and 64.2% achieved by HTSE on the ICEWS05-15 and GDELT datasets, respectively, are both the highest values. On the ICEWS14 dataset, our model is slightly inferior to DyERNIE and TIMEPLEX because of a few redundant entities in the dataset. On the other two datasets, HTSE outperforms DyERNIE and TIMEPLEX due to its time surface, which offers higher representative power and thus improves the accuracy of the predicted temporal intervals. It is worth mentioning that DyERNIE and ChronoR are second only to HTSE, which further illustrates the benefits of utilizing specific geometry-based methods to enhance the performance of temporal scope inference by addressing the issue of incomplete projections in the embedding process. In other words, the surface projection benefits the representation of temporal semantic information for a given triple. This is an encouraging conclusion of this research.

Analysis of hierarchical relation embeddings
In this part, we aim to demonstrate that the HTSE model can effectively model hierarchical entities and relations at different levels by introducing a modulus in relation embeddings inspired by [31].In addition, it is shown that HTSE is more accurate than the similar DyERNIE [11] model for entity matching at three types of hierarchical levels because it comprehensively analyses the embeddings of the hierarchical relations.
In Figure 6, the distribution histograms concerning three types of relations are presented with the corresponding hierarchies. These relations are chosen from the ICEWS, GDELT, Wikidata and YAGO datasets. Specifically, the comparison involves three groups of examples concerning three types of hierarchical relations.
1. As shown in (a) and (d) in Figure 6, the relations "is_affiliated_to" in the YAGO dataset and "city_of" in the GDELT dataset indicate that the head entities are at lower semantic hierarchies than the corresponding tail entities.
2. As shown in (b) and (e) in Figure 6, the entities linked by the relations "make_a_visit" in the ICEWS dataset and "is_married_to" in the YAGO dataset are at the same semantic hierarchy.
3. As shown in (c) and (f) in Figure 6, the relations "has_part" and "has_cause" in the Wikidata dataset represent exactly the opposite situation to (a) and (d) in the first item, that is, the head entities are at higher semantic hierarchies than the corresponding tail entities.
Similarly, we set $h_m$ and $t_m$ as each entry of the head and tail entities, so that the corresponding moduli are $\|h_m\|$ and $\|t_m\|$. Then, following [31], we formulate the corresponding relations as $r_m = h_m^{-1} \circ t_m$. Therefore, we can obtain the modulus of the relations as $\|r_m\|$. Empirically, a modulus of $\|r_m\| = 1$ indicates the same hierarchy. Notably, a smaller modulus (the horizontal axis) indicates a lower semantic hierarchy. On this basis, we expect that a smaller variance and a tighter distribution lead to clearer hierarchy modelling. As shown in Figure 6, our HTSE model (the blue one) corresponds to a much tighter distribution than DyERNIE (the orange one), which represents a smaller variance and confirms our expectation. In conclusion, this experiment demonstrates that our model can model hierarchical relations and distinguish the corresponding entities better than a similar model.
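The modulus analysis above can be illustrated with a minimal sketch. It assumes that the head and tail moduli are available as positive scalars per entry, so that $r_m = h_m^{-1} \circ t_m$ reduces to an entry-wise ratio. The function names and the sample values are hypothetical and are not taken from Figure 6.

```python
from statistics import mean, pvariance


def relation_moduli(head_moduli, tail_moduli):
    """Entry-wise relation modulus r_m = t_m / h_m (assumes positive head moduli)."""
    return [t / h for h, t in zip(head_moduli, tail_moduli)]


def summarize(moduli):
    """Return (mean, population variance) of a relation's modulus distribution."""
    return mean(moduli), pvariance(moduli)


# Made-up moduli for one same-hierarchy relation under two hypothetical models.
heads = [1.0, 2.0, 4.0]
tight_model = relation_moduli(heads, [1.1, 2.0, 3.8])   # ratios cluster near 1
loose_model = relation_moduli(heads, [1.6, 1.0, 5.2])   # ratios spread widely

tight_mean, tight_var = summarize(tight_model)
loose_mean, loose_var = summarize(loose_model)
tighter = tight_var < loose_var  # smaller variance means a tighter histogram
```

In this reading, a distribution concentrated around 1 corresponds to a same-hierarchy relation, and comparing variances is one simple way to quantify which of two histograms is tighter.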

Qualitative analysis
To demonstrate the explainable strategy for hierarchical embeddings in our model, we present some queries, the corresponding candidates and the requirements used to select the confirmed answer. Table 6 shows queries based on different hierarchies. For example, query 1 asks 'Who is hosting the visit in Malaysia?'. We compare the candidate 'Barack Obama' with 'United States' and then choose 'Barack Obama' as the confirmed answer because of the requirement 'Lower hierarchy and smaller moduli'. For query 2 and query 3, we confirm

Significance test
To show the significant advantage of HTSE, we set up a significance test experiment and obtained the results shown in Figure 7. The centre of each horizontal segment, which is drawn as a triangle, represents the average rank value, and the horizontal segment is the critical value field.
The horizontal segment of HTSE, which has the highest rank, is distant from those of the other methods and does not intersect them, which demonstrates that HTSE outperforms these baselines significantly. Furthermore, HTSE and DyERNIE perform better than the other three models, which demonstrates that geometry-based models enhance the accuracy of temporal knowledge graph embedding. In addition, the segments of ChronoR and DyERNIE partly intersect, which means that there is no obvious distinction between the two models, although the previous analysis shows that DyERNIE still has some advantages. HyTE is at an obvious disadvantage because of its incomplete projection in temporal knowledge graph embedding.
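The reading of Figure 7 described above (average ranks with a critical value field, and significance when segments do not intersect) can be sketched as follows. The text does not name the test, so this sketch assumes a Nemenyi-style critical difference, $CD = q_\alpha \sqrt{k(k+1)/(6N)}$ for $k$ methods and $N$ datasets; that choice, the function names and the toy scores for the placeholder methods A, B and C are all assumptions for illustration only.

```python
import math

# Nemenyi critical values q_alpha at alpha = 0.05, indexed by number of methods k.
Q_ALPHA_005 = {2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728}


def average_ranks(scores):
    """scores maps method -> list of scores per dataset (higher is better).

    Rank 1 is the best method on a dataset. Ties are not handled specially.
    """
    methods = list(scores)
    n_datasets = len(next(iter(scores.values())))
    totals = {m: 0.0 for m in methods}
    for i in range(n_datasets):
        ordered = sorted(methods, key=lambda m: scores[m][i], reverse=True)
        for rank, m in enumerate(ordered, start=1):
            totals[m] += rank
    return {m: totals[m] / n_datasets for m in methods}


def critical_difference(k, n_datasets, q_alpha):
    """Width of the critical value field for k methods over n_datasets."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n_datasets))


def significantly_different(rank_a, rank_b, cd):
    """Segments of width cd centred on the ranks do not intersect."""
    return abs(rank_a - rank_b) > cd


# Toy scores for three placeholder methods on four datasets (made up).
toy = {
    "A": [0.9, 0.8, 0.7, 0.9],
    "B": [0.5, 0.6, 0.5, 0.6],
    "C": [0.1, 0.2, 0.3, 0.2],
}
ranks = average_ranks(toy)
cd = critical_difference(len(toy), 4, Q_ALPHA_005[len(toy)])
a_vs_c = significantly_different(ranks["A"], ranks["C"], cd)
a_vs_b = significantly_different(ranks["A"], ranks["B"], cd)
```

With segments of width `cd` centred on each average rank, two segments fail to intersect exactly when the rank gap exceeds `cd`, which matches the visual criterion used for Figure 7.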

Conclusion and future work
In this work, a novel time surface-aware embedding model is proposed to learn significant representations from TKGs. Our model employs a relation-oriented hierarchy modelling strategy to address the issue of ignored semantic hierarchies. Another issue concerns the inaccurate semantic expressions caused by incomplete projections. To address this issue, we use a time surface model with exponential mapping to enhance the representations of temporal characteristics. Experimental results indicate that HTSE achieves promising results and outperforms its geometric counterpart and other SOTA models. This demonstrates the advantages of utilizing surface-based spaces and hierarchical modelling for inference and prediction tasks in temporal knowledge graph embeddings.
However, a full explanation of our model is not provided in this work, and the causal relations between links are ignored in the embedding process. Therefore, we believe that enhancing the interpretability of temporal causal embeddings will be the focus of future research.

Figure 1 Principle of temporal knowledge graph embedding

Figure 2 Details of hierarchy modelling and projection on a time surface

Figure 3 Details of hierarchy modelling and projection on a time surface

Figure 4 Example of temporal knowledge graph embedding process

Figure 5 Results of temporal link prediction for future time stamps

Figure 6 Distribution histograms of hierarchical relation embeddings for several temporal datasets

Figure 7 Significance test

• The aims are different. DyERNIE aims to learn multirelational data through dynamic relation embeddings. Although it can capture the geometric features in a KG, it ignores the explainable strategy for hierarchical embeddings. Instead, HTSE aims to model the hierarchical space and provides explicit evidence for temporal link prediction associated with different hierarchical levels when modelling hierarchical relations.
• The methods to model temporal information are different. DyERNIE aims
The main components and training process of the HTSE model are discussed in detail in Section 3. Extensive experiments are analysed and presented in Section 4. The paper concludes and provides an outlook for future development in Section 5.

Table 1
f(h, r, t, τ)      Score function
S_k                Standard deviation
r_0, v_0           Initial relation vector, initial curvature vector
Q^+_τ, Q^-_τ       Valid quadruples, negative samples
γ                  Margin between valid and negative quadruples

the embedding modulus of these head/tail entities. Yellow and purple dots denote entities belonging to different hierarchies. The red arrows indicate the hierarchical relations R_i. This component completes the triple modelling and supports the transformation to quadruples with time stamps.

Table 2
The sizes of the categories contained in the five benchmark datasets

Table 4
Temporal link prediction performance on the datasets

Table 5
PG scores obtained for temporal information prediction

Table 6
Case study for a query based on a different hierarchy