A Reproducible Benchmark for P2P Retrieval

Thomas Neumann, Matthias Bender, Sebastian Michel, Gerhard Weikum
Max-Planck-Institut für Informatik
June 30, 2006

Overview
1. Motivation
2. Setting
3. Data Corpus and Queries
4. Data Placement
5. Experiments
6. Conclusion

Motivation - P2P Retrieval
[Figure labels: Peer lists (directory); Local index X0; Bookmarks B0. Sample directory entries: term a: 17, 11, 92, ...; term c: 13, 92, 45, ...; term f: 43, 65, 92, ...; url v: 73, 105, 17, ...; url w: 7, 48, 21, ...; url x: 75, 43, 12, ...; url y: 37, 44, 12, ...]

Motivation - Current
- many papers about P2P retrieval
[Figure: total message size, scale 0.0 to 1.0, on the data sets NLANR-REAL, WorldCup, DEC-64, DEC-128, NLANR-203, Berkley-512. Plot labels: Top 10 multicast, Threshold, TPUT, TPUT+Hash]
[Figure: GOV, c=10%; x-axis: number of query terms; scale 0 to 140,000; series: DTA, TPUT, X-TPUT, KLEE 3, KLEE 4; axis label: Bandwidth ...]


Motivation - Current
- many papers about P2P retrieval
- all have different experiments
- no standard collection, no standard peer construction, no queries
- experiments are hard to reproduce
- especially peer construction is unclear

existing retrieval benchmarks (e.g. TREC):
- only centralized
- assignment to peers hard to justify
- not freely available

Motivation - Goals
- freely available, easily reproducible benchmark
- prefer real-world data
- realistic peer construction
- still allow for studying effects of overlap and peer size
- not a single data peer set; provide an algorithm

In particular specify:
- suitable data corpus
- assignment to peers
- queries

Setting - Assumptions
What is a realistic P2P retrieval setting?
1. distributed, no central authority
2. independent peers, collaborate only ad hoc
3. peers acquire data autonomously
4. some loose thematic focus per peer
5. data with some kind of graph structure (e.g. the web)
6. queries correspond to general user interests