This section discusses augmented reality (AR) tracking for indoor navigation and mobile computing, and explains the existing AR techniques and their constraints.
Indoor tracking remains one of the main challenges facing augmented reality applications. Low Chee Huey et al. implemented an early indoor technique (Huey, 2011), using laptops and USB webcams to capture camera frames. They used ARToolkit to detect a marker and compare it with the markers previously stored in a database. When a marker is identified at a precise position, it is converted to a location ID for processing. The OpenGL API then loads the VRML model using the marker’s coordinates, and the route planner module determines the shortest path between the current and destination locations.
To reduce the difficulties of using a laptop for indoor tracking, Muhammad Fadzly et al. replaced it with a Raspberry Pi. Their system is composed of a web camera, a Raspberry Pi, display glasses, and an input controller (bin Abdul Malek, 2017). Each component connected to the Raspberry Pi has a distinct role. The client starts the application by specifying the desired location, at which point the camera begins capturing live images to detect area markers. The system is shown in Fig. 1.
This diagram is updated from (Huey, 2011), which describes the architecture of the augmented reality marker-based indoor navigation system.
Currently, most applications aim to replace marker-based navigation with image recognition-based navigation. Jia-Hua Wu et al. implemented an approach consisting of three parts: the server, the user, and the administrator (Wu, 2020). The server stores the databases used to create the indoor map. The administrator creates a map of the internal space and the surrounding feature points. They used the k-nearest neighbours (KNN) algorithm to train their model. They uploaded the trained characteristics of the images of an internal place to the backend and used the YOLOv3 object detection method to extract characteristics during real-time tracking. The characteristics extracted by YOLOv3 are compared with the training characteristics to determine the current location (Huang R. a., 2019). The images are then binarized to remove extraneous numbers from the final result. Finally, users scan their surroundings and enter their destination, after which the indoor system can be used directly. The A* algorithm calculates the shortest path from their starting point to their final destination.
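The path-planning step can be illustrated with a minimal sketch of A* on a grid map. This is not the implementation from (Wu, 2020). The grid encoding (0 for walkable, 1 for wall), 4-connected movement, the Manhattan heuristic, and the names `a_star` and `floor_plan` are all assumptions made for illustration.

```python
import heapq


def a_star(grid, start, goal):
    """Return the shortest 4-connected path from start to goal, or None.

    grid is a list of rows: 0 marks a walkable cell, 1 marks a wall
    (an assumed encoding). start and goal are (row, col) tuples.
    """
    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if not inside or grid[nr][nc] != 0:
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get((nr, nc), float("inf")):
                best_cost[(nr, nc)] = new_cost
                came_from[(nr, nc)] = cell
                priority = new_cost + heuristic((nr, nc))
                heapq.heappush(open_heap, (priority, new_cost, (nr, nc)))
    return None


# Hypothetical floor plan: the only gap in the middle wall is at column 2.
floor_plan = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
route = a_star(floor_plan, (0, 0), (2, 0))
print(route)
```

In a real deployment the grid would be derived from the administrator's indoor map, and the step cost could reflect walking distance rather than a unit cost per cell.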
Indoor navigation systems based on NFC have become more popular (Ozdenizci B. a., 2011). The core idea is a smartphone with an integrated NFC component running an application that guides the user (Deak, 2012) (Ozdenizci B. a., 2015). The mobile device connects to the URL Tag by touching it, and the map server responds by sending the map information, which is then loaded onto the mobile device. Once the map is loaded, the indoor navigation system turns the map data into a link-node model, which is a 2D network with topological relationships. The user then selects a destination point, and the application uses Dijkstra’s shortest path algorithm to compute the most efficient route quickly. Data from NFC tags are collected to determine the current location. The user only needs to touch the mobile device to a tag along the way to validate the navigation.
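As an illustration of route computation on a link-node model, the following is a minimal sketch of Dijkstra's algorithm. The node names, link lengths, and the dictionary layout of `links` are hypothetical and are not taken from the cited systems.

```python
import heapq


def dijkstra(links, source, target):
    """Return (total_length, node_list) for the shortest route.

    links maps each node to a dict of {neighbour: link_length}.
    Returns (inf, []) when the target cannot be reached.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break  # the first time the target is popped, its distance is final
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, length in links.get(node, {}).items():
            nd = d + length
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                prev[neighbour] = node
                heapq.heappush(heap, (nd, neighbour))
    if target not in dist:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


# Hypothetical link-node model: nodes are tagged places, weights are metres.
links = {
    "entrance": {"hall": 5.0, "stairs": 12.0},
    "hall": {"entrance": 5.0, "stairs": 4.0, "lab": 10.0},
    "stairs": {"entrance": 12.0, "hall": 4.0, "lab": 3.0},
    "lab": {"hall": 10.0, "stairs": 3.0},
}
length, route = dijkstra(links, "entrance", "lab")
print(length, route)
```

In an NFC-based system, the source node would plausibly be the node associated with the most recently touched tag, so the route can be recomputed each time the user validates their position.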
Chiaki Takahashi and Kazuhiro Kondo developed an indoor navigation system using beacons (Takahashi, 2015). Beacons transmit advertising signals to smartphones over Bluetooth Low Energy (BLE), and mobile devices can estimate their distance to a beacon based on the signal strength (Campaña, 2017). The beacons are placed in the corners of the area and used to construct a database of radio wave strength. The intensity detected at an unknown site is then matched against this database to determine the current position.
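The distance estimate from signal strength can be sketched with the log-distance path-loss model, a common textbook choice. The cited works do not state which model they use, so the model itself, the function name, and the default calibration values (RSSI of -59 dBm at 1 m, path-loss exponent of 2) are assumptions.

```python
def estimate_distance_m(rssi_dbm, rssi_at_1m_dbm=-59.0, path_loss_exponent=2.0):
    """Estimate the distance to a beacon from a received signal strength.

    Log-distance path-loss model:
        rssi = rssi_at_1m - 10 * n * log10(d)
    solved for d. Both calibration defaults are assumed values; in practice
    they are measured for each beacon model and environment.
    """
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


# A weaker signal maps to a larger estimated distance.
for rssi in (-59.0, -69.0, -79.0):
    print(rssi, round(estimate_distance_m(rssi), 2))
```

Indoor reflections and obstacles make such estimates noisy, which is one reason a database of measured signal strengths is matched against instead of relying on a propagation model alone.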
Wondimu K. Zegeye et al. developed an indoor localization system based on Wi-Fi received signal strength (RSS) fingerprinting (Zegeye, 2016). This technique does not depend on any particular features or beacons. The system represents the indoor environment using a grid-based representation (Brevi, 2009). Experiments were conducted using the building’s pre-existing Wi-Fi infrastructure. The offline phase of this method uses sampled RSS values to create a radio map of the area under consideration. To build the radio map, the application scans and collects information about reachable access points (APs) at each sampling location, including their RSS values. Then, during the online phase, the localization algorithm estimates the device’s position. The algorithm’s response time in a standalone architecture is approximately 220 ms.
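The two phases can be sketched as follows. This is a minimal illustration and not the localization algorithm of (Zegeye, 2016): it assumes nearest-neighbour matching in signal space with Euclidean distance, and the names `build_radio_map`, `locate`, and `MISSING_RSS_DBM`, as well as the sample readings, are hypothetical.

```python
import math
from statistics import mean

# Assumed floor value for access points that were not heard in a scan.
MISSING_RSS_DBM = -100.0


def build_radio_map(samples):
    """Offline phase: average the RSS scans collected at each location.

    samples maps a grid location to a list of scans; each scan maps an
    access point id to its RSS in dBm.
    """
    radio_map = {}
    for location, scans in samples.items():
        aps = {ap for scan in scans for ap in scan}
        radio_map[location] = {
            ap: mean(scan.get(ap, MISSING_RSS_DBM) for scan in scans)
            for ap in aps
        }
    return radio_map


def locate(radio_map, scan):
    """Online phase: return the location whose fingerprint is closest."""
    def signal_distance(fingerprint):
        aps = set(fingerprint) | set(scan)
        return math.sqrt(sum(
            (fingerprint.get(ap, MISSING_RSS_DBM)
             - scan.get(ap, MISSING_RSS_DBM)) ** 2
            for ap in aps
        ))
    return min(radio_map, key=lambda loc: signal_distance(radio_map[loc]))


# Hypothetical survey of three grid cells with two access points.
samples = {
    (0, 0): [{"ap1": -40, "ap2": -70}, {"ap1": -42, "ap2": -72}],
    (0, 1): [{"ap1": -55, "ap2": -55}, {"ap1": -57, "ap2": -53}],
    (1, 1): [{"ap1": -72, "ap2": -41}, {"ap1": -70, "ap2": -39}],
}
radio_map = build_radio_map(samples)
position = locate(radio_map, {"ap1": -68, "ap2": -43})
print(position)
```

Averaging several scans per location in the offline phase dampens short-term RSS fluctuations, at the cost of a longer site survey.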
Weilin Xu et al. developed a pedestrian tracking algorithm to improve the Wi-Fi grid-based indoor model (Xu, 2018). Indoor space is subdivided into grid cells of a defined size and semantics. The pedestrian tracking algorithm repeatedly predicts the probability of the mobile device being located in each of these cells based on indoor and magnetometer measurements. The grid filter is a discrete Bayesian filter that probabilistically calculates the position of a target based on sensor measurements. The tracking system uses the Markov chain model (Hayes, 2013) to determine its position over time.
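A discrete Bayesian grid filter alternates a Markov prediction step with a measurement update. The sketch below shows one cycle on a one-dimensional corridor of four cells. The transition probabilities and the measurement likelihoods are invented for illustration and do not come from (Xu, 2018).

```python
def predict(belief, transition):
    """Markov prediction: transition[i][j] = P(next cell j | current cell i)."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n))
            for j in range(n)]


def update(belief, likelihood):
    """Measurement update: weight each cell by P(measurement | cell)."""
    weighted = [b * l for b, l in zip(belief, likelihood)]
    total = sum(weighted)
    if total == 0:
        return belief  # measurement rules out every cell; keep the prediction
    return [w / total for w in weighted]


# Assumed motion model: stay in place or step one cell forward, 0.5 each.
transition = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]
belief = [1.0, 0.0, 0.0, 0.0]      # pedestrian starts in cell 0
likelihood = [0.1, 0.7, 0.1, 0.1]  # hypothetical sensor reading favouring cell 1

belief = update(predict(belief, transition), likelihood)
most_likely_cell = max(range(len(belief)), key=belief.__getitem__)
print(belief, most_likely_cell)
```

Because the belief is stored per cell, the cost of each cycle grows with the number of cells, which ties the achievable resolution to the grid size chosen for the indoor model.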
Edward et al. went on to build an indoor navigation system (WPIN) using Bluetooth Low Energy (BLE) beacons, called Lbeacons, deployed at each intersection inside a building (Chu, 2019). Users are shown 2D pictures indicating their direction along the path to the destination, such as turn left, turn right, and go straight. The WPIN application comprises three layers: the user interface, navigation, and indoor positioning. The navigation module is used to detect the user’s location after receiving input, including the destination. Within the positioning module, BLE advertising messages are received from various Lbeacons. Each advertising message contains the sender’s coordinates. The WPIN application determines the user’s position by selecting the message with the greatest RSSI (Received Signal Strength Indicator) value among the filtered messages. When the user arrives at a new waypoint, the WPIN application displays a direction indicator on the screen, such as turn left, turn right, or continue straight. The positioning-navigation procedure is repeated until the user reaches the destination. The architecture of the WPIN application is shown in Fig. 2.
This diagram is adapted from (Chu, 2019), which describes the architecture of the WPIN application.
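The positioning rule used by WPIN, selecting the advertising message with the greatest RSSI among the filtered messages, can be sketched as follows. The source does not specify how messages are filtered, so the minimum-RSSI threshold is an assumption, and the `Advertisement` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    beacon_id: str
    coordinates: tuple  # (x, y) carried in the advertising message
    rssi_dbm: float


def estimate_position(messages, min_rssi_dbm=-90.0):
    """Return the coordinates of the strongest beacon, or None.

    Messages weaker than min_rssi_dbm are discarded first. This threshold
    is an assumed stand-in for the unspecified filtering step.
    """
    filtered = [m for m in messages if m.rssi_dbm >= min_rssi_dbm]
    if not filtered:
        return None
    return max(filtered, key=lambda m: m.rssi_dbm).coordinates


# Hypothetical scan at an intersection covered by three beacons.
scan = [
    Advertisement("lb-01", (0.0, 0.0), -82.0),
    Advertisement("lb-02", (12.0, 0.0), -61.0),
    Advertisement("lb-03", (12.0, 8.0), -95.0),
]
position = estimate_position(scan)
print(position)
```

Snapping the user to the strongest beacon keeps the computation trivial on a smartphone, and the positioning resolution is then bounded by the spacing between beacons.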
ARBIN (Huang B.-C. a.-H.-M., 2020), an augmented reality-based navigation system, extends the previous work, WPIN. ARBIN and WPIN share a similar architecture based on Bluetooth beacons. However, due to the limitations of a 2D navigation map, WPIN users may experience mental strain and become confused as they attempt to relate their real surroundings to the 2D navigation map before proceeding. ARBIN therefore uses Google ARCore to display navigation directions on the screen, overlaid on the real-world environment.
Indoor tracking systems are commonly implemented using marker, communication, or image detection technologies. We briefly discussed how each indoor tracking system is implemented, and from this review we draw the following observations. The marker-based system is stable when the marker image is prepared appropriately. However, it has practical drawbacks: when the mobile camera is moved away from the marker, the augmented reality experience is lost and the marker must be scanned again, and scanning does not work in some scenarios where markers reflect light. The communication-based system is widely applicable, computationally efficient, and can integrate with two- and three-dimensional maps, but it is usually unsuitable for applications requiring highly accurate tracking. A system based on image detection technologies can achieve high positioning accuracy, and it can output the view angle as well as the position. The current limitations of image detection methods for indoor tracking are that they rely on pre-loaded maps and that low light levels reduce their accuracy. In addition, image detection is computationally intensive and requires high-performance hardware. A comparison of common features of indoor positioning systems is shown in Table I.
Table I. Comparison of common features of indoor positioning systems
| Paper | Built on | Client platform | Area size | Accuracy |
|---|---|---|---|---|
| (Huey, 2011) | Marker, ARToolkit, OpenGL, USB webcams and laptops | laptops | medium | NA |
| (bin Abdul Malek, 2017) | Marker, ARToolkit, OpenGL, Raspberry Pi and USB webcams | glasses | medium | ∼100% |
| (Takahashi, 2015) | iBeacons and local server | smartphone | small | NA |
| (Chu, 2019) | Lbeacons, navigation module, positioning module and 2D pictures indicating direction | smartphone | large | 92.5%, 3–5 m |
| (Huang B.-C. a.-H.-M., 2020) | Lbeacons, navigation module, positioning module and 3D arrow model (in real time) | smartphone | large | 92.5%, 3–5 m |
| (Ozdenizci B. a., 2015) | NFC tags, map server, link-node model and Dijkstra’s shortest path algorithm | smartphone | medium | ∼100% |
| (Zegeye, 2016) | Wi-Fi (access points), radio map, server and localization algorithm | smartphone | medium | 80%, 5 m |
| (Xu, 2018) | Wi-Fi grid-based model, pedestrian tracking algorithm, Kalman filter and Markov chain model | smartphone | large | 92%, 3.5 m |
| (Wu, 2020) | 2D map, server, YOLOv3 object detection method, KNN algorithm and A* algorithm | smartphone | medium | NA |