Topics in Privacy
The following schedule and descriptions are tentative. Topics are usually not posted earlier than the week before.
| # | Date | Description |
|---|---|---|
| 1 | 6/1 | Anonymous Internet Communications Using Mix Networks |
| 2 | 6/8 | The Art and Science of Privacy |
| 3 | 6/22 | Erroneous Fingerprint Matches |
| 4 | 6/29 | Learning the Criminal Relations Behind Scam Spam |
| 5 | 7/6 | Outsourcing in Compliance with HIPAA and the Role of Privacy Technology |
| 6 | 7/20 | Usable Privacy and Security Software |
| 7 | 7/27 | Open Discussion |
| 8 | 8/2 | The Economic Potentials of RFID/EPC in Retail Industry |
| 9 | 8/9 | Sentiment Extraction from Unstructured Text using Tabu Search-Enhanced Markov Blanket |
| 10 | 8/20 | Does privacy-preserving data mining have anything to do with privacy? |
| 11 | 8/24 | Current Graduate Students Talk with New Graduate Students about Computer Science Research in Computation, Organizations and Society |
Abstracts of Talks and Discussions
Ever since Brooke Singer
(Carnegie Mellon and Data Privacy Lab 2002)
did her graduate thesis
in art on data privacy, the Lab has had several requests from museums for more work
that enables informal, interactive learning on privacy. Recently, another
opportunity emerged for the Lab to acquire funds for such work.
At today's brainstorming session, we will explore possible projects, media
to use, and the ongoing intimate relationship between art, politics, and the
science of privacy. Of historical importance, as evidenced in part by
Brooke's work, are algorithms that enable others to put data fragments together
to learn sensitive information from disparate data sources. Is the world now
ready to move beyond sensational shock factors about what is available
and also learn about solutions?
Are on-line demonstrations better? What privacy topics should be examined?
What are the important issues to learn? What are the guidelines
or evaluation criteria by which these projects should be measured?
[Presenter: L. Sweeney]
In response to a May 22, 2004 Associated Press article by Andrew Kramer, entitled "Experts say erroneous fingerprint matches likely to become more common," discussion emerged on the local mailing list about the nature of fingerprint matching. Dr. Weedn identified two reasons, beyond those expressed in the article, why the error rate may climb: (1) the workload on examiners has escalated dramatically as they are called on to do more background checks; and (2) there are probably more errors than recognized, and greater scrutiny will lead to the discovery of more cases. Dr. Weedn further points out that the article suggests that humans prevent errors; while this may be true, they can also sanction a misidentification, inadvertently or not. Dr. Sweeney expressed concern over the use of minutiae rather than the entire (or partial) image. She points out that a few point areas (12, 18 or 24) in a fingerprint make the decision, which seems even more suspect as the number of fingerprints in the gallery grows to include so much of the population. This notion was challenged by the claim that more information may exist in the minutiae than in just the primary friction ridge pattern. With this start, the brainstorming session today will focus on current fingerprint matching techniques and where weaknesses may reside. A copy of the article is available at
http://www.kgw.com/sharedcontent/APStories/stories/D82MR7D80.html.
[Presenter: Vic Weedn, MD JD]
Unsolicited communications currently account for over sixty percent of
all sent e-mail, with projections reaching the mid-eighty-percent range. While much
spam is innocuous, a portion is engineered by criminals to prey upon,
or scam, unsuspecting people. The senders of scam spam attempt to mask
their messages as non-spam and con through a range of tactics,
including pyramid schemes, securities fraud, and identity theft via
phisher mechanisms (e.g. faux PayPal or AOL websites). To lessen the
suspicion of fraudulent activities, scam messages sent by the same
individual, or collaborating group, augment the text of their messages
and assume an endless number of pseudonyms with an equal number of
different stories. In this paper, we introduce ScamSlam, a software
system designed to learn the underlying number of criminal cells
perpetrating a particular type of scam, as well as to identify which
scam spam messages were written by which cells. The system consists of
two main components: 1) a filtering mechanism based on a Poisson
classifier to separate scam from general spam and non-spam messages,
and 2) a message normalization and clustering technique to relate scam
messages to one another. We apply ScamSlam to a corpus of
approximately 500 scam messages communicating the Nigerian advance
fee fraud. The scam filtration method filters out greater than 99%
of scam messages, vastly outperforming well-known spam filtering
software, which catches only 82% of the scam messages. Through the
clustering component, we discover that at least half of all scam
messages are accounted for by 20 individuals or collaborating groups.
[Presenter: Edoardo Airoldi]
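The abstract above does not specify the features, smoothing, or exact form of the Poisson classifier, so the following is only a minimal sketch of one common reading: a naive Bayes style filter in which each word count in a message is modeled as Poisson with a class-specific rate. The function names, the smoothing constant `alpha`, and the toy messages are all assumptions made for illustration and are not taken from ScamSlam.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


def train_poisson(docs, labels, alpha=0.5):
    """Estimate a per-class Poisson rate (expected count per message) for each word."""
    vocab = sorted({w for d in docs for w in tokenize(d)})
    model = {"vocab": vocab, "log_prior": {}, "rate": {}}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        counts = Counter(w for d in class_docs for w in tokenize(d))
        n = len(class_docs)
        model["log_prior"][c] = math.log(n / len(docs))
        # Additive smoothing (alpha) keeps every rate strictly positive.
        model["rate"][c] = {w: (counts[w] + alpha) / n for w in vocab}
    return model


def classify(model, text):
    """Return the class with the highest Poisson log-likelihood plus log prior."""
    x = Counter(tokenize(text))
    best_class, best_score = None, -math.inf
    for c, rates in model["rate"].items():
        score = model["log_prior"][c]
        for w in model["vocab"]:
            lam = rates[w]
            # log Poisson(x | lam) = x*log(lam) - lam - log(x!);
            # the log(x!) term is the same for every class, so it is dropped.
            score += x[w] * math.log(lam) - lam
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Toy training data, invented for this example only.
docs = [
    "urgent transfer of funds from bank account",
    "confidential funds transfer assistance needed urgent",
    "meeting agenda for project review",
    "cheap watches buy now discount",
]
labels = ["scam", "scam", "other", "other"]
model = train_poisson(docs, labels)
```

With this toy model, `classify(model, "urgent funds transfer")` selects the "scam" class. Words outside the training vocabulary are ignored, which is a simplification; a real filter would need a much larger corpus and a deliberate choice of features.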
There has been a plethora of articles on outsourcing, and drafts of various bills placing limits on outsourcing are circulating in Congress. Most attention is placed on offshoring and the related loss of American jobs. However, there are many privacy concerns as well. This discussion will examine privacy problems related to the sharing of medical data for outsourcing and offshoring. Questions that will be explored include the following. What are the kinds of data and uses for outsourcing? What are the privacy risks? How do these relate to HIPAA (the new medical privacy legislation)? What role can technology and privacy technology play?
[Presenter: Latanya Sweeney]
Earlier this year, Lorrie Cranor launched the CMU Usable Privacy and Security Lab
(CUPS). She also led a Workshop on Usable Privacy and Security Software
earlier this month. In this TIP session, Lorrie will describe CUPS and work
in this area. More information about the new lab is available at
cups.cs.cmu.edu.
[Presenter: Lorrie Cranor]
This session is dedicated to discussing topics of interest to those who attend. Topics may include recent events in the press, general privacy research trends and/or concerns, or work currently underway. At least once or twice a semester, we try to have these kinds of open discussion sessions because they afford more topics to be covered in a session than possible when a full session is dedicated to one topic. Please attend and brainstorm with us.
The use of mobile information systems and communication technology furthers the informatization and automation of operating processes. In the retail industry, mobile communication technology affords new forms of customer communication by establishing electronic and individually designable channels of communication. In conjunction with powerful information systems, these options enable both improved services for customers and an increase in turnover and profit through more flexible and individualized pricing and communication policy. This contribution illustrates the forms of pricing policy, using the "Extra Future Store" of METRO Group as a case study, and identifies two determinants of success for these new types of electronic customer communications: the power of disposal over the end device and the range of communication.
[Presenters: Dr. Jens Strueker and Dr. Stefan Sackmann,
Institute of Computer Science and Social Studies,
University of Freiburg]
Extracting sentiments from unstructured text has emerged as
an important problem in many disciplines. An accurate method would
enable us, for example, to mine on-line opinions from the Internet and
learn customers' preferences for economic or marketing research, or for
leveraging a strategic advantage. In this paper, we propose a two-stage
Bayesian algorithm that is able to capture the dependencies among
words, and, at the same time, finds a vocabulary that is efficient for
the purpose of extracting sentiments. Experimental results on the Movie
Reviews data set show that our algorithm is able to select a
parsimonious feature set with substantially fewer predictor variables
than in the full data set and leads to better predictions about
sentiment orientations than several state-of-the-art machine learning
methods. Our findings suggest that sentiments are captured by
conditional dependence relations among words, rather than by keywords
or high-frequency words.
More information and a copy of the paper are available at
dataprivacylab.org/dataprivacy/projects/sentiment/.
[Presenter: Edoardo Airoldi]
Introductions and updates on recent work, followed by open discussion and debate
about "privacy" in the body of work known as privacy-preserving data mining.
[Presenter: Chris Clifton, Purdue University]
The newest PhD Program in Carnegie Mellon's School of Computer Science is the
PhD Program in Computation, Organizations and Society (COS). In this session,
existing COS students talk about their research and the research process. COS students entering this Fall semester talk about their research interests and expectations. Others are invited to share research experiences with new and developing students.
More information about the new COS PhD program is available at
www.cos.cs.cmu.edu.