Enhancing data quality with effective feature selection and privacy protection
Lu-Yao WANG , Zhu-Sen LIU , Qi FENG , Wei-Bin WU , Lu ZHOU
Front. Comput. Sci. ›› 2026, Vol. 20 ›› Issue (3) : 2003607
Privacy-preserving feature selection identifies the most important features while keeping the data private, thereby improving data quality. Secure multiparty computation (MPC) is a cryptographic technique that enables joint data processing without a trusted third party. However, most MPC-based feature selection schemes overlook correlations between features and perform poorly for model training on datasets that contain both numerical and categorical attributes. This paper proposes a feature selection scheme, MPC-Relief, that selects relevant features while preserving privacy. To achieve security under MPC, we transform all complex computational steps from data-dependent to data-oblivious form with faithful implementations. Specifically, we construct bidirectional vectors to partition subsets and propose an MPC-based nonlinear function, MN-Ramp, to calculate the difference between mixed attributes. In addition, we apply a mapping method to the distance calculation to eliminate the need for conditional judgments. We evaluate the computational and communication overhead of the MN-Ramp function in both WAN and LAN environments and validate its effectiveness across various datasets. The comparative analysis demonstrates that our scheme achieves up to an 18% accuracy improvement over other schemes on nonlinear datasets. The results of the classification task based on the selected features indicate that our scheme notably enhances the performance of subsequent models while providing strong privacy guarantees.
feature selection / secure multiparty computation / machine learning as a service / secret sharing
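To make the idea of a ramp-based difference over mixed attributes concrete, the following is a minimal plaintext sketch. It is not the paper's MN-Ramp protocol, whose construction is not given in the abstract. It assumes the standard ramp function from the Relief family of algorithms, with hypothetical thresholds `t_eq` and `t_diff`, integer-coded categorical values, and a public 0/1 flag `is_categorical`. The attribute type is selected by arithmetic rather than by an `if`, which only hints at how a data-dependent branch might be made data-oblivious; a real MPC version would operate on secret shares and replace `min`, `max`, and `!=` with secure comparison protocols.

```python
def ramp(d, t_eq, t_diff):
    """Relief-style ramp: 0 below t_eq, 1 above t_diff, linear in between."""
    scaled = (d - t_eq) / (t_diff - t_eq)
    # Clamp to [0, 1]; in an MPC setting these would be secure comparisons.
    return min(1.0, max(0.0, scaled))


def mixed_diff(a, b, is_categorical, t_eq=0.05, t_diff=0.10):
    """Difference between two values of one attribute, in [0, 1].

    is_categorical is 0 or 1 (assumed public here). Both candidate
    results are computed and one is selected arithmetically, so the
    control flow does not depend on the attribute type.
    """
    cat = float(a != b)                    # categorical: 0 if equal, else 1
    num = ramp(abs(a - b), t_eq, t_diff)   # numerical: ramp of the distance
    return is_categorical * cat + (1 - is_categorical) * num


if __name__ == "__main__":
    print(mixed_diff(0.30, 0.32, is_categorical=0))  # distance below t_eq
    print(mixed_diff(0.0, 1.0, is_categorical=0))    # distance above t_diff
    print(mixed_diff(2, 3, is_categorical=1))        # different categories
```

The thresholds are illustrative only; in practice they would be chosen per attribute, and the inputs would be normalized so that numerical distances fall in a comparable range.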
Higher Education Press