Spark iForest is designed to be easy to use. Its usage is similar to the iForest implementation in sklearn [3].
A pyspark package is also provided. More details and usage instructions can be found in the python folder.
- *numTrees:* The number of trees in the iforest model (>0).
- *predictionCol:* Prediction column name, default "prediction".
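
To make the role of *numTrees* concrete, here is a minimal pure-Python sketch of the isolation forest idea from [1]. It is not the spark-iforest implementation and does not use its API. The names `fit_forest`, `anomaly_score`, `num_trees`, `max_samples` and `seed` are hypothetical and chosen only for illustration, with `num_trees` playing the role of *numTrees*. Each tree isolates points with random splits, and points that are isolated after few splits receive a score closer to 1.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def avg_path_length(n):
    """c(n): average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # a constant feature cannot be split
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node):
    """Depth at which the point lands, plus c(size) for the unsplit leaf."""
    depth = 0
    while "size" not in node:
        side = "left" if point[node["dim"]] < node["split"] else "right"
        node = node[side]
        depth += 1
    return depth + avg_path_length(node["size"])


def fit_forest(data, num_trees=100, max_samples=256, seed=0):
    """Build num_trees isolation trees, each on a random subsample."""
    if num_trees <= 0:
        raise ValueError("num_trees must be > 0")
    if len(data) < 2:
        raise ValueError("need at least two data points")
    rng = random.Random(seed)
    sample_size = min(max_samples, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(num_trees)
    ]
    return trees, sample_size


def anomaly_score(point, trees, sample_size):
    """Score in (0, 1); higher means the point is easier to isolate."""
    mean_path = sum(path_length(point, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / avg_path_length(sample_size))


# Toy data: 200 points around the origin plus one obvious outlier.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
data.append((10.0, 10.0))

trees, sample_size = fit_forest(data, num_trees=100)
scores = [anomaly_score(p, trees, sample_size) for p in data]
print("outlier score:", round(scores[-1], 3))
print("mean inlier score:", round(sum(scores[:-1]) / len(scores[:-1]), 3))
```

Averaging over more trees reduces the variance of the score at the cost of more training and scoring work, which is the trade-off a tree-count parameter controls.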
## Examples
The following code is an example of detecting anomalous data points using
If you encounter any bugs, feel free to submit an issue or a pull request. You can also email:
<a href="mailto:fangzhou.yang@hotmail.com">Yang, Fangzhou (fangzhou.yang@hotmail.com)</a>
## Citation
Please cite spark-iforest in your publications if it helps your research. Here is an example BibTeX entry:
author={Yang, Fangzhou and contributors},
## References
[1] Liu F T, Ting K M, Zhou Z H. Isolation Forest[C]. IEEE International Conference on Data Mining, 2008.
