Contributed by Ms. Wenting Shen, Prof. Jing Qin, and Prof. Jiankun Hu, based on the article “Enabling Identity-Based Integrity Auditing and Data Sharing With Sensitive Information Hiding for Secure Cloud Storage.” The original article is open access and freely available for download on IEEE Xplore®.
Dr. Jiankun Hu and Wenting Shen also presented this topic in a 2020 SPS Education webinar. Visit the recorded webinar at the SPS Resource Center.
With the explosive growth of data, storing everything locally has become a heavy burden for users. More and more organizations and individuals therefore store their data in the cloud. However, data stored in the cloud might be corrupted or lost because of software bugs, hardware faults, and human errors in the cloud. It is thus necessary to verify the integrity of data stored in the cloud.
In remote data integrity auditing schemes, the data owner first generates signatures for the data blocks and then uploads the blocks, along with their corresponding signatures, to the cloud. During the integrity auditing phase, these signatures are used to prove that the cloud truly possesses the data blocks.
In many cloud storage applications, such as Google Drive, Dropbox, and iCloud, the stored data is shared across multiple users. Data sharing, one of the most common features of cloud storage, allows users to share their data with others. However, shared data stored in the cloud might contain sensitive information. For instance, Electronic Health Records (EHRs) stored and shared in the cloud usually contain the patient’s sensitive information (name, telephone number, ID number, etc.) and the hospital’s sensitive information (hospital name, etc.). If these EHRs are uploaded directly to the cloud and shared for research purposes, the sensitive information of both the patient and the hospital will inevitably be exposed to the cloud and the researchers. At the same time, the integrity of the EHRs must be guaranteed, because human errors and software or hardware failures can occur in the cloud. It is therefore important to perform remote data integrity auditing while keeping the sensitive information in shared data protected.
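The basic tag-then-audit flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the scheme from the article: it substitutes a symmetric HMAC-SHA256 tag for the identity-based signatures, so the auditor must hold the owner's key and the cloud returns the challenged blocks themselves. The names `split_blocks`, `tag_block`, and `audit`, the 16-byte block size, and the sample record are all assumptions made for the example.

```python
import hashlib
import hmac
import secrets
from collections.abc import Iterable

BLOCK_SIZE = 16  # bytes per data block (illustrative choice)


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a file into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def tag_block(key: bytes, file_id: bytes, index: int, block: bytes) -> bytes:
    """Compute a tag that binds a block to its file and its position."""
    message = file_id + index.to_bytes(8, "big") + block
    return hmac.new(key, message, hashlib.sha256).digest()


def audit(key: bytes, file_id: bytes,
          storage: list[tuple[bytes, bytes]],
          challenge: Iterable[int]) -> bool:
    """Check the (block, tag) pairs the cloud returns for the challenged indices."""
    for i in challenge:
        block, tag = storage[i]
        expected = tag_block(key, file_id, i, block)
        if not hmac.compare_digest(tag, expected):
            return False
    return True


# Data owner: tag every block, then "upload" blocks and tags together.
key = secrets.token_bytes(32)
file_id = b"ehr-0001"  # hypothetical file identifier
blocks = split_blocks(b"name=Alice;id=12345;hospital=General;diagnosis=flu")
storage = [(b, tag_block(key, file_id, i, b)) for i, b in enumerate(blocks)]

# Auditor: challenge a random subset of block indices.
challenge = secrets.SystemRandom().sample(range(len(storage)), k=2)
audit_passed = audit(key, file_id, storage, challenge)
```

Including the file identifier and block index in each tag means the cloud cannot answer a challenge for one block with a different, validly tagged block.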
A feasible way to solve this problem is to introduce a sanitizer that sanitizes the data blocks corresponding to the sensitive information in the file. Specifically, the user blinds the data blocks corresponding to the personal sensitive information in the original file, generates the corresponding signatures, and sends the blinded blocks and signatures to the sanitizer. The sanitizer converts these blinded data blocks into a uniform format and also sanitizes the data blocks corresponding to the organization’s sensitive information. Generally, the sanitized data blocks are replaced with wildcards. The sanitizer then transforms the signatures of the sanitized data blocks into valid signatures for the sanitized file. This method supports remote data integrity auditing and also allows data to be shared in cloud storage while the sensitive information remains protected.
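The blinding and sanitization steps can be illustrated with a short Python sketch. It is a simplification under stated assumptions: blinding is modeled as an XOR mask derived from a key held only by the user, and sanitized blocks are overwritten with a fixed wildcard block. The names `blind`, `unblind`, `sanitize`, and `WILDCARD` are hypothetical, as is the sample record. The signature transformation performed by the sanitizer is not modeled here, since it depends on the algebraic structure of the signatures in the original article.

```python
import hashlib
import hmac
from collections.abc import Iterable

BLOCK_SIZE = 16                 # bytes per data block (illustrative choice)
WILDCARD = b"*" * BLOCK_SIZE    # uniform replacement for sanitized blocks


def _mask(key: bytes, index: int, length: int) -> bytes:
    """Derive a per-block mask from the user's key (HMAC-SHA256 used as a PRF)."""
    assert length <= 32, "one SHA-256 digest covers blocks up to 32 bytes"
    return hmac.new(key, index.to_bytes(8, "big"), hashlib.sha256).digest()[:length]


def blind(blocks: list[bytes], personal: Iterable[int], key: bytes) -> list[bytes]:
    """User side: XOR-mask the blocks that hold personal sensitive information."""
    out = list(blocks)
    for i in personal:
        mask = _mask(key, i, len(out[i]))
        out[i] = bytes(a ^ b for a, b in zip(out[i], mask))
    return out


# XOR with the same mask is its own inverse, so unblinding reuses blind().
unblind = blind


def sanitize(blocks: list[bytes], personal: Iterable[int],
             organizational: Iterable[int]) -> list[bytes]:
    """Sanitizer side: replace every sensitive block with the wildcard block."""
    out = list(blocks)
    for i in set(personal) | set(organizational):
        out[i] = WILDCARD
    return out


# A three-block record: personal, organizational, and non-sensitive content.
blocks = [b"name: Alice Wong", b"hosp: St. Mary's", b"diagnosis: flu  "]
personal, organizational = [0], [1]
user_key = b"user-held secret key (example)"

blinded = blind(blocks, personal, user_key)            # sent to the sanitizer
recovered = unblind(blinded, personal, user_key)       # user recovers the original
sanitized = sanitize(blinded, personal, organizational)  # uploaded to the cloud
```

In this sketch the sanitizer only ever sees the masked personal block, and the cloud sees wildcards in place of both sensitive blocks, while the non-sensitive block passes through unchanged.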
Here, we give an illustrative example for EHRs. In this example, the sensitive information in an EHR has two parts. One is the personal sensitive information (the patient’s sensitive information), such as the patient’s name and ID number. The other is the organization’s sensitive information (the hospital’s sensitive information), such as the hospital’s name. Generally speaking, this sensitive information should be replaced with wildcards when the EHRs are uploaded to the cloud for research purposes. The sanitizer can be viewed as the administrator of the EHR information system in a hospital. The personal sensitive information should not be exposed to the sanitizer, and none of the sensitive information should be exposed to the cloud or the shared users.
A medical doctor needs to generate the EHRs of patients and send them to the sanitizer, which stores them in the EHR information system. However, these EHRs usually contain sensitive information about the patient and the hospital, such as the patient’s name, the patient’s ID number, and the hospital’s name. To preserve the patient’s privacy from the sanitizer, the medical doctor blinds the patient’s sensitive information in each EHR before sending it to the sanitizer. The medical doctor then generates signatures for the blinded EHR and sends them to the sanitizer. The sanitizer stores the blinded EHR and its signatures in the EHR information system. When the medical doctor needs the EHR, the doctor sends a request to the sanitizer, which downloads the blinded EHR from the system and returns it. Finally, the medical doctor recovers the original EHR from the blinded EHR.
When the EHR needs to be uploaded to the cloud and shared for research purposes, the sanitizer sanitizes the data blocks corresponding to the patient’s sensitive information in order to unify the format. In addition, to protect the hospital’s privacy, the sanitizer sanitizes the data blocks corresponding to the hospital’s sensitive information. As noted earlier, these data blocks are replaced with wildcards, and the sanitizer transforms their signatures into valid signatures for the sanitized EHR, so remote data integrity auditing can still be performed effectively. During sanitization, the sanitizer does not need to interact with the medical doctor. Finally, the sanitizer uploads the sanitized EHRs and their corresponding signatures to the cloud. In this way, the EHRs can be shared and used by researchers while their sensitive information remains hidden. Most importantly, the integrity of the EHRs stored in the cloud can still be ensured.