Tamás Gábor Csapó


E-mail: csapot[at]tmit.bme.hu

Research areas: articulation, text-to-speech, signal processing, human-machine interfaces.

 

Education

2014: PhD-degree: 2013, Thesis title: “Increasing the naturalness of synthesized speech in hidden Markov-model based text-to-speech synthesis” (in Hungarian)
Jan 2014–Jul 2014: Fulbright scholarship, Department of Speech and Hearing Sciences, Indiana University, Bloomington, IN, USA, Research topic: “Investigating tongue movement during speech with ultrasound”.

2008–2013: PhD studies: program of Informatics at Budapest University of Technology and Economics (BME), Hungary. Fully funded by state scholarship. Summa cum laude, 100%.
2008: MSc-degree: 2008, Thesis title: “Implementation of prosodic variability in Text-To-Speech systems” (in Hungarian).

2003–2008: MSc Engineer in Technical Informatics, BME, Hungary, Major: Next Generation Networks, Cumulated Grade Average: 4.29/5.00
1999–2003: Kossuth Lajos High School (higher level maths class)

 

Academic Positions

2016–: Part-time research fellow, MTA

Nov 2014–: Assistant research fellow, Speech Technology and Smart Interactions Laboratory (SmartLab), Department of Telecommunications and Media Informatics (TMIT), Budapest University of Technology and Economics (BME), Hungary.

Jan 2014–Jul 2014: Visiting student researcher, Speech Production Laboratory, Department of Speech and Hearing Sciences, Indiana University, Bloomington, IN, USA.
2011–2014: PhD candidate, BME TMIT SmartLab, Hungary.
2008–2011: PhD student, BME TMIT SmartLab, Hungary.

 

Teaching Experience

2016–: Smart City laboratory, (in Hungarian), compiled and supervised a new lab on augmented reality applications. BME.
2015–: Software laboratory – databases, (in Hungarian), taught lectures and rated midterm work. BME.
2014–: Infocommunication, (in English), developed course material for English, taught lectures. BME.
2010–: Human-Computer Interaction, (in English and Hungarian), developed course material, assisted and taught lectures, rated mid-term project work.
2010–: Project Laboratory and thesis writing, (in Hungarian), supervised BSc and MSc students. BME and IU.
2008–2015: Measurement Laboratory, (in English and Hungarian), compiled and supervised a new lab on VoiceXML dialog planning and taught Speech Coding Lab. BME.

 

Supervising

Successfully defended:

2015: BSc-thesis: Zoltán Umlauf, Message handling on Android extended with speech technology, (Beszédtechnológiával kiegészített üzenetkezelő rendszer Androidon), BME TMIT.

2015: BSc-thesis: Dávid Csopor, Ultrasound-based tongue contour tracking with Deep Neural Networks, (Mély neuronhálók alkalmazása ultrahangos nyelvkontúr követésre), BME TMIT.

2012: BSc-thesis: Balázs Bárány, Speech driven remote control for SmartTV, (Beszéd alapú távvezérlő OkosTV-hez), BME TMIT.

2012: BSc-thesis: Barnabás Péter Weller, Design and application of acoustic icons, (Akusztikus ikonok tervezése és alkalmazása), BME TMIT.

2011: BSc-thesis: Roland Porció, Speech synthesis with spontaneous characteristics, (Spontán jellegű beszéd mesterséges előállítása), BME TMIT.

 

Awards, Scholarships, Grants

Jul 2016: NVidia Hardware Grant (Titan X GPU)

Mar 2016: Award, Ministry of National Development, Professional Medal for Information Society.
Apr 2015: Award: 1st prize, Huszty Dénes Foundation, PhD Thesis Contest.
Jan 2014–Jul 2014: Fulbright scholarship, Indiana University, Bloomington, IN, United States of America.
Jan 2014: Grant: Hungarian Academy of Engineering, travel grant to Bloomington.
Oct 2013: Award: 3rd prize, BMe research grant, dissemination contest.
Jul 2013: Grant: Campus Hungary, travel grant to 8th Speech Synthesis Workshop.
May 2013: Award: 1st prize, Microsoft No Time to W8 contest, for the “Weather for All” application (together with colleagues from BME-TMIT).
Apr 2010: Grant: Acoustical Society of America, International Student Grant.
Sep 2009: Grant: Bizáky Puky Péter Foundation, travel grant to Interspeech 2009.
May 2009: Award: 3rd prize, Audio Engineering Society (Hungary), MSc Thesis Contest.
Nov 2007: Award: 2nd prize, Scientific Students’ Associations Annual Conference of BME.
Nov 2007: Award: 1st prize, Scientific Students’ Assoc. Ann. Conf. of BME (in German).
Sep 2007–Aug 2008: Scholarship of the Hungarian Republic, awarded by the Ministry of Education of the Hungarian Republic.
Sep 2007–Jan 2008: Scholarship of the Faculty, Faculty of Electrical Engineering and Informatics, BME.
Sep 2007–Jan 2008: Scholarship of the University, Budapest University of Technology and Economics.
Aug 2007: Grant: International Speech Communication Association, travel grant to Interspeech 2007.
Apr 2007: Award: 1st prize, National Conference of Scientific Students’ Associations.
Nov 2006: Award: 1st prize, Scientific Students’ Association Annual Conference of BME.

 

Research Projects, Grants

2016–: MTA-ELTE Lendület Lingual Articulation Research Group – funded by the Momentum program of Hungarian Academy of Sciences, leader: Alexandra Markó

Jan 2014–Jul 2014: Fulbright scholarship, Indiana University, Bloomington, IN, United States of America. Supervisor: Dr. Steven M. Lulich.

 

Professional Activities:

Review-services: IEEE Signal Processing Letters, Journal on Multimodal User Interfaces, Intelligent Decision Technologies, International Journal of Speech Technology, IETE Technical Review, SPECOM (2016), RADIOELEKTRONIKA (2016), Interspeech (2013), CogInfoCom (2013)

 

Professional association memberships:

International Speech Communication Association
IEEE Signal Processing Society
Scientific Association for Infocommunications Hungary

 


Publications: (MTMT) Altogether 51 publications, of which

  • 17 journal papers (13 international, 4 national)
  • 23 conference papers (18 international, 5 national)
  • 3 book chapters (3 national)
  • 8 others

  1. Tamás Gábor Csapó, Tamás Grósz, Gábor Gosztolya, László Tóth, Alexandra Markó, DNN-based ultrasound-to-speech conversion for a silent speech interface, In: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, Stockholm, Sweden, 2017, pp. 3672-3676 [DOI, paper, Scopus, presentation]
  2. Mohammed Salah Al-Radhi, Tamás Gábor Csapó, Géza Németh, Time-domain envelope modulating the noise component of excitation in a continuous residual-based vocoder for statistical parametric speech synthesis, In: Interspeech 2017, Stockholm, Sweden, 2017, pp. 434-438 [DOI, paper, Scopus]
  3. Mohammed Salah Al-Radhi, Tamás Gábor Csapó, Géza Németh, Effects of adding a Harmonic-to-Noise Ratio parameter to a Continuous vocoder, In: UK Speech Conference 2017, Cambridge, United Kingdom / England, 2017, p. 27 [poster, pdf]
  4. Mohammed Salah Al-Radhi, Tamás Gábor Csapó, Géza Németh, Deep recurrent neural networks in speech synthesis using a continuous vocoder, In: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, Hatfield, United Kingdom / England, vol. 10458 LNAI, 2017, pp. 282-291 [DOI, Scopus, pdf]
  5. Mohammed Salah Al-Radhi, Tamás Gábor Csapó, Géza Németh, Continuous vocoder in feed-forward deep neural network based speech synthesis, In: Digital speech and image processing, Novi Sad, Serbia, 2017, pp. 1-4
  6. Markó Alexandra, Deme Andrea, Varjasi Gergely, Bartók Márton, Gráczi Tekla Etelka, Csapó Tamás Gábor, Word-initial irregular phonation as a function of speech rate and vowel quality in Hungarian, In: International Seminar on Speech Production, Tianjin, China, 2017, p. 2
  7. Kele Xu, Pierre Roussel, Tamás Gábor Csapó, Bruce Denby, Convolutional neural network-based automatic classification of midsagittal tongue gestural targets using B-mode ultrasound images, In: JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, vol. 141, no. 6, 2017, pp. EL531-EL537 [DOI, paper, WoS]
  8. Alexandra Markó, Tamás Gábor Csapó, Karolina Takács, Listeners’ evaluation of voice quality in Hungarian speakers, In: BESZÉDKUTATÁS, vol. 25, 2017, pp. 55-66 [DOI]
  9. Tamás Gábor Csapó, Géza Németh, Milos Cernak, Philip N Garner, Modeling Unvoiced Sounds In Statistical Parametric Speech Synthesis with a Continuous Vocoder, In: 24th European Signal Processing Conference, EUSIPCO 2016, Budapest, Hungary, 2016, pp. 1338-1342 [DOI, Scopus, presentation, paper, pdf, WoS]
  10. Milan Sečujski, Branislav Gerazov, Tamás Gábor Csapó, Vlado Delić, Philip N Garner, Aleksandar Gjoreski, David Guennec, Zoran Ivanovski, Aleksandar Melov, Géza Németh, Ana Stojković, György Szaszák, Design of a Speech Corpus for Research on Cross-Lingual Prosody Transfer, In: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, Budapest, Hungary, vol. 9811, 2016, pp. 199-206 [DOI, Scopus, WoS]
  11. Kele Xu, Tamás Gábor Csapó, Pierre Roussel, Bruce Denby, A comparative study on the contour tracking algorithms in ultrasound tongue images with automatic re-initialization, In: JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, vol. 139, no. 5, 2016, pp. EL154-EL160 [DOI, Scopus, WoS]
  12. Bálint Pál Tóth, Tamás Gábor Csapó, Continuous Fundamental Frequency Prediction with Deep Neural Networks, In: European Signal Processing Conference (EUSIPCO 2016), Budapest, Hungary, 2016, pp. 1348-1352 [DOI, Scopus, presentation, paper, pdf, WoS]
  13. Tamás Gábor Csapó, Géza Németh, Milos Cernak, Residual-based excitation with continuous F0 modeling in HMM-based speech synthesis, In: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, Budapest, Hungary, vol. 9449, 2015, pp. 27-38 [DOI, Scopus, audio samples, presentation, paper, pdf]
  14. Tamás Gábor Csapó, Steven M Lulich, Error analysis of extracted tongue contours from 2D ultrasound images, In: Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Dresden, Germany, 2015, pp. 2157-2161 [image, poster, videos, Scopus, paper, pdf, WoS]
  15. Tamás Gábor Csapó, Géza Németh, Automatic transformation of irregular to regular voice by residual analysis and synthesis, In: Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Dresden, Germany, 2015, pp. 613-617 [image, poster, Scopus, paper, pdf, WoS]
  16. Kálmán Abari, Tamás Gábor Csapó, Bálint Pál Tóth, Gábor Olaszy, From text to formants - indirect model for trajectory prediction based on a multi-speaker parallel speech database, In: Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Dresden, Germany, 2015, pp. 623-627 [poster, Scopus, demo, paper, pdf, WoS]
  17. Tamás Gábor Csapó, Géza Németh, Statistical parametric speech synthesis with a novel codebook-based excitation model, In: INTELLIGENT DECISION TECHNOLOGIES, vol. 8, no. 4, 2014, pp. 289-299 [paper, Scopus]
  18. Tamás Gábor Csapó, Géza Németh, Modeling irregular voice in statistical parametric speech synthesis with residual codebook based excitation, In: IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, vol. 8, no. 2, 2014, pp. 209-220 [DOI, paper, Scopus, WoS]
  19. Gy Szaszák, T Gábor Csapó, P N Garner, B Gerazov, Z Ivanovski, G Németh, B Tóth, M Sečujski, and V Delić, The SP2 SCOPES project on speech prosody, In: Proceedings of DOGS2014 - Digital speech and image processing, Novi Sad, Serbia, 2014, pp. 2-10 [paper]
  20. António Teixeira, Annika Hämäläinen, Jairo Avelar, Nuno Almeida, Géza Németh, Tibor Fegyó, Csaba Zainkó, Tamás Csapó, Bálint Tóth, André Oliveira, Miguel Sales Dias, Speech-centric Multimodal Interaction for Easy-to-access Online Services – A Personal Life Assistant for the Elderly, In: PROCEDIA COMPUTER SCIENCE, Vigo, Spain, vol. 27, 2014, p. 9 [DOI, paper, Scopus, WoS]
  21. Tamás Gábor Csapó, Géza Németh, A novel irregular voice model for HMM-based speech synthesis, In: ISCA 8th Speech Synthesis Workshop (SSW8), Barcelona, Spain, 2013, pp. 229-234 [paper, audio samples, presentation]
  22. Tamás Gábor Csapó, Increasing the naturalness of synthesized speech (PhD summary), In: PHONETICIAN, vol. 105-106, 2012, pp. 88-97 [website, paper, pdf]
  23. Tamás Gábor Csapó, Géza Németh, A novel codebook-based excitation model for use in speech synthesis, In: Cognitive Infocommunications (CogInfoCom), Košice, Slovakia, 2012, pp. 661-665 [DOI, image, Scopus, presentation, video, paper, Google Scholar, pdf, WoS]
  24. Éva Székely, Tamás Gábor Csapó, Bálint Tóth, Péter Mihajlik, Julie Carson-Berndsen, Synthesizing Expressive Speech from Amateur Audiobook Recordings, In: IEEE Workshop on Spoken Language Technology, Miami (FL), United States of America, 2012, pp. 297-302 [DOI, paper, Scopus, pdf, WoS]
  25. Gráczi TE, Lulich SM, Csapó TG, Beke A, Context and speaker dependency in the relation of vowel formants and subglottal resonances: Evidence from Hungarian, In: Interspeech 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, 2011, pp. 1901-1904 [image, poster, Scopus, paper, pdf, WoS]
  26. Géza Németh, Gábor Olaszy, Tamás Gábor Csapó, Spemoticons: Text-To-Speech based emotional auditory cues, In: ICAD 2011, Budapest, Hungary, 2011, pp. 1-7 [paper, Google Scholar, pdf]
  27. Tamás Gábor Csapó, Csaba Zainkó, Géza Németh, A Study of Prosodic Variability Methods in a Corpus-Based Unit Selection Text-To-Speech System, In: INFOCOMMUNICATIONS JOURNAL, vol. LXV, no. 1, 2010, pp. 32-37 [paper, pdf]
  28. Csapó TG, Bárkányi Zs, Gráczi TE, Bőhm T, Lulich SM, Relation of formants and subglottal resonances in Hungarian vowels, In: 10th annual conference of the International Speech Communication Association 2009 (INTERSPEECH 2009), United Kingdom / England, 2010, pp. 484-487 [image, poster, Scopus, paper, pdf, WoS]
  29. Csaba Zainkó, Tamás Gábor Csapó, Géza Németh, Special Speech Synthesis for Social Network Websites, In: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, vol. 6231, 2010, pp. 455-463 [DOI, image, Scopus, presentation, paper, Google Scholar, WoS]
  30. Géza Németh, Tamás Gábor Csapó, Bálint Tóth, Improving the Quality of Unit Selection and HMM based Speech Synthesis, 2009 [link]
  31. Csapó TG, Gráczi TE, Bárkányi Zs, Beke A, Lulich SM, Patterns of Hungarian vowel production and perception with regard to subglottal resonances, In: PHONETICIAN, vol. 99-100, 2009, pp. 7-28 [website, paper, link]
  32. Németh G, Fék M, Csapó T G, Increasing Prosodic Variability of Text-To-Speech Synthesizers, In: Interspeech 2007, Antwerp, Belgium, 2007, pp. 474-477 [poster, audio samples, Scopus, paper, Google Scholar, WoS]