Figure 1. Steps of the KDD [3] process.
These rules first appeared with the use of large databases of commercial transactions, now known as Big Data [19] [20]. The setting is generally known today as the classic “housewife's market basket” problem, which explains the origin of association rules. It is perhaps best illustrated by the relation “if condition then result”. The question is then to determine, for each basket purchased by a customer, which expresses his needs and preferences, the rules on which the supermarket should focus in order to manage all customer baskets efficiently and simultaneously. A very large mass of information must then be coordinated in order to define, analyze and extract the information relevant to that target.
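The relation “if condition then result” can be made concrete with the two usual measures of an association rule, support and confidence. The following is a minimal sketch on invented baskets; the function names and the data are assumptions made for illustration and do not come from the study.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(basket) for basket in baskets) / len(baskets)


def confidence(baskets, condition, result):
    """Estimate of P(result | condition) for the rule 'if condition then result'."""
    condition_support = support(baskets, condition)
    if condition_support == 0:
        return 0.0
    return support(baskets, set(condition) | set(result)) / condition_support


# Hypothetical market baskets (illustrative data only).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
```

In this toy example the rule “if bread then milk” holds in two of the three baskets that contain bread.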
In our case, the individual characteristics of the unemployed must be associated in order to determine the duration of unemployment related to them and, at the same time, to act on pre-established decision rules by selecting actions that reduce this duration and consequently raise the probability of leaving unemployment. Applying KDD [4] techniques to such a massive and detailed individual database, translated into relevant and synthetic information through a “Data Mining procedure” [5], should then allow us to reach our target.
The result will then be expressed by associative algorithms and will eventually give a simplified computer model of very complex association rules. Data mining techniques generally draw on several fields: machine learning, statistics, knowledge representation, artificial intelligence and expert systems. Data mining is an iterative and interactive analysis scheme which uses raw data to extract relevant and reliable information that the analyst can easily process [21]. The interactive process generated by Data Mining reflects the way a human analyst could analyse, control, and take corrective decisions. The analyst takes charge of reprocessing the information to extract the most interesting part. This must be preceded by a pre-selection that makes the search on the data easier. The procedure then consists in applying artificial intelligence methods to determine the best algorithmic model. At this level, various graphic and numerical visualization methods enable the results to be presented and evaluated effectively, so that the main conclusions can be drawn.
The place of the expert analyst (the human) is expected to be essential in that procedure. The reality is quite different. It is important here to distinguish between the Data Mining process (DMP) and human-machine interaction (HMI). [22] and [23] have shown that the DMP is rather based on task-oriented systems, whereas HMIs are limited to defining them. The role of the human is then divided into specific functions for a given application. These specific functions represent the basis of the knowledge system. Accomplishing a task achieves the specific objective associated with it. Each function can be subdivided according to its specificity. At this point the intervention of the expert may be decisive or even necessary. The main purpose here is to build a tool capable of forecasting and estimating the duration of an unemployment episode according to the profile of the candidate. The major goal is to reduce the flow of jobless people and to help the public intermediary take immediate decisions that reduce this duration, thanks to the use of an unemployment simulator.
2.2 Empirical approach
To establish the rules on which the simulator should act to correctly determine the duration of unemployment and the individual recommendations, we first have to determine a model that best expresses the individual behavior of the Tunisian unemployed according to their characteristics and attributes.
A. The setting of the context
Our initial database, covering the period from January 2010 to September 2015, includes, after filtering outliers, 206.409 male registrants (49.7%) and 208.156 female registrants (50.3%). There are also 323.3667 unmarried (78%), 78.803 married (19%), 12.442 widowed (3%) and 37.328 divorced (9%). Our territorial database has 24 governorates (which will eventually be grouped into 7 basins) and 256 delegations. There are also 75 types of initial diplomas (which will be grouped into 5 modalities) and 6 levels of schooling. The database also includes registration dates and position dates (less than 60 days) for registered unemployed persons (active and passive individuals). We classified individuals by date of registration, date of position, age, sex, governorate (area), delegation (district), marital status, level of education, diploma, and specialties (other details were eliminated for the purposes of the analysis, through a reasoned choice of the most interesting, useful and discriminating [6] variables).
Then, in order to get the best insights from the qualitative variables that form the basis of our work on the duration of unemployment, Multiple Correspondence Analysis (MCA) is the most natural choice [24].
B. Alternative of Discrete Choice Models
The evolution of econometrics has made it possible to go from an aggregated macroeconomic analysis of data to a microeconomic analysis of individual attributes, thanks to the development of computerized data processing and the savings in time and money it allows [25][26]. Thus, in addition to the traditional quantitative statistical data that are usually processed, qualitative, more complex and heterogeneous characteristics are now treated (gender, socio-professional category, geographical affiliation, type of education, being unemployed or not, etc.). Initially, traditional statistical methods only allowed the modelling and analysis of quantitative characteristics. More specific methods have been developed and used since; they take into account the absence of continuity of the variables treated, or the absence of a natural order between the modalities of a qualitative variable [27][28]. These specific methods are the subject of our work on the interest and the statistical significance of qualitative variables, which are often neglected or omitted.
Historically, the study of models describing the modalities of one or more qualitative variables began in the 1940s. The most relevant research was that of Berkson (1944-1951), including the simple dichotomous models known as the logit and probit models. The first empirical validations mainly concerned various sciences, ranging from sociology to psychology and physics. It was not until the 1970s that these models were first applied in economics and political science, thanks in particular to the articles by McFadden (1974) and James J. Heckman (1976). The modelling proposed by these authors has provided a framework for applying the econometric techniques of qualitative variables to economic problems. This has made it possible to improve the interpretation of simple and synthetic models (logit). Later, an intermediate model, partly qualitative and partly quantitative (the Tobit model), was developed; it is of definite interest for complex and diversified problems.
Consider a qualitative variable that can take K different modalities. When K = 2, the variable is called dichotomous; for example, being unemployed or not. In the general case, where K is greater than 2, the variable is called polytomous. It is at this level that the difficulty of modelling a qualitative variable econometrically appears. Hence the advantage of switching to discrete-choice modelling, which allows us to find detailed analytical solutions to our complex problem of duration. We will first apply Multiple Correspondence Analysis (MCA) in order to get a first idea of the grouping of variables to be associated with the duration. We will then continue with a multinomial logit (ML) model, which expresses the duration of unemployment in probabilistic terms according to the different modalities that explain it; this should allow us to determine the weight of the modalities (explanatory variables) in the duration of unemployment (explained variable).
We will then go further and aggregate the ML into an ordinal logit (OL), which expresses the unemployment duration in probabilistic terms according to the modalities of the variables that best explain it, so that the different results of the discrete-choice modelling of qualitative variables can be compared.
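To illustrate what “in probabilistic terms” means for the multinomial logit, the sketch below computes the class probabilities from a set of coefficients. The three duration classes, the coding of the explanatory vector and all coefficient values are hypothetical; they are not the estimates of this study.

```python
import math


def multinomial_logit_probs(x, betas):
    """Return P(y = k | x) for every class k of a multinomial logit.

    betas[k] is the coefficient vector of class k; the reference class
    is represented by a vector of zeros.
    """
    scores = [sum(b * xi for b, xi in zip(beta, x)) for beta in betas]
    highest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical coefficients for three duration classes.
# Assumed coding of x: (constant, female dummy, age in decades).
betas = [
    [0.0, 0.0, 0.0],    # reference class (short duration)
    [0.2, -0.3, 0.1],   # medium duration
    [-0.5, -0.6, 0.3],  # long duration
]

probs = multinomial_logit_probs([1.0, 1.0, 3.0], betas)
```

Each class receives its own coefficient vector, which is why the multinomial logit yields one set of weights per modality of the explained variable rather than a single aggregated equation.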
C. Management of censoring and truncation
Our sample is limited to the observation period from 1st January 2006 to 30 September 2015. As a result, some unemployment spells may not be observed in full. A second censoring concerns the entry and exit flows during the study period and should be taken into account in the analysis and modelling. Considering only the data observed in their entirety is not the best solution, since censoring and truncation affect the likelihood and hence the estimated parameters. The estimators would then be biased.
Figure 1. Censoring and truncation
Intuitively, if censored durations are treated as complete survival times, the survival function will be underestimated. Likewise, if truncation is omitted, exit rates will be overestimated.
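The effect of right-censoring can be illustrated with the Kaplan-Meier estimator. This estimator is not named in the text; it is used here only as a standard textbook device, and the five durations below are invented. The sketch compares the estimate that respects censoring with the naive one that treats every censored spell as a completed spell.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate.

    durations: spell lengths in days.
    observed[i]: True when the exit from unemployment was seen,
                 False when the spell was right-censored.
    Returns a list of (time, survival probability) at each exit time.
    """
    exit_times = sorted({t for t, seen in zip(durations, observed) if seen})
    survival = 1.0
    curve = []
    for t in exit_times:
        at_risk = sum(1 for d in durations if d >= t)
        exits = sum(1 for d, seen in zip(durations, observed) if d == t and seen)
        survival *= 1.0 - exits / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical spells: two of them are still ongoing at the end of observation.
durations = [30, 60, 60, 90, 120]
observed = [True, True, False, True, False]

km_curve = kaplan_meier(durations, observed)
naive_curve = kaplan_meier(durations, [True] * len(durations))
```

On these toy data the naive curve lies below the censoring-aware curve from day 60 onward, which is the underestimation of survival described above.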
D. MCA analysis
Figure 2. The variables most correlated with the unemployment duration
We apply MCA in order to get an initial idea of the correlated qualitative variables (correlation ratio of the variables that make up the dimensions of the MCA). The ultimate goal is to find the variables that are graphically closest to the duration, and which could therefore best explain it. This graphical reading is, however, not a test of statistical significance (the variables are individual qualitative variables), and a logit model will be required to confirm or invalidate the results found.
We thus compare the representations of the active variables (the formative and discriminating variables of the model) with that of the qualitative illustrative variable (the duration of unemployment in our case). The figure shows the following:
• Gender and marital status are not graphically close to the duration, and thus a priori do not discriminate it.
• Age (in the first place) and the governorate of residence (and / or of membership of the candidates) appear to be the variables most correlated with duration.
• Level and diploma are also quite explanatory, and therefore related to duration, but of second importance in terms of proximity.
This figure, although quite interesting, remains incomplete and should be verified by a model appropriate for this type of (qualitative) variable. The most suitable approach seems to be discrete-choice modelling of the qualitative variables. This is reflected in research that has attempted to model transitions and transition probabilities in the job market. The synthesis of all these works, and in particular the work of Daniel L. McFadden (1974) and James J. Heckman (1976) [29][30], leads us to retain two types of model which should allow us to represent and discriminate the duration of unemployment: the logit and the probit. A strong similarity between the two models has been known for some time (Anas 1975 and Williams 1977), but the magnitude of this similarity has been underestimated. Our work will try to show that the two approaches are identical from the point of view of the modelling results. In the multinomial logit model, a different model can be obtained for each level of the modalities. The only concern is that these cannot be aggregated into a single model: the statistical literature either remains silent on this problem of aggregation or excludes aggregation for this type of model. In this context, logit modelling (less constraining and more general than probit modelling) is proposed for the qualitative variables, which must be coded appropriately because of the highly heterogeneous nature of the variables and modalities. From this we can compare a model with several modalities of the variables (multinomial) and a more aggregated model (ordinal) to draw the analyses and conclusions.
E. Switching to the multinomial logit (ML)
Table 1. Results of modelling in multinomial logit
| Variables | Criteria for fitting the model: log-likelihood of the model | Likelihood ratio tests: Chi-square | Degrees of freedom | Signif. |
| Constant | 84344,514a | ,000 | 0 | . |
| MaritalStatu | 84351,528b | 7,014 | 12 | ,857 |
| GouvResidence(3) | 86799,572b | 2455,057 | 92 | ,000 |
| Gender(2) | 87426,736b | 3082,222 | 4 | ,000 |
| Diploma(4) | 86815,788b | 2471,274 | 16 | ,000 |
| Level(5) | 85092,749b | 748,234 | 16 | ,000 |
| Age (1) | 90452,300b | 6107,786 | 20 | ,000 |
| Specialities(6) | 84468,969 | 124,455 | 28 | ,000 |
| YearsDiploma(7) | 84676,538b | 332,024 | 20 | ,000 |
The multinomial logit model, estimated under SPSS 21 and under R, shows that all the variables explaining the duration of unemployment are significant, with the exception of marital status. Age appears to be the most important variable explaining duration (as evidenced by the highest chi-square), followed by gender, diploma, the governorate of residence (area), level, year of diploma and specialties. This first important result, which discriminates duration according to the other variables characterising unemployment, is an additional confirmation of what a number of authors who have treated the subject have advanced. Indeed, the table in the appendices shows that the significance varies across the modalities of the variables, with the exception of marital status, which is not significant whatever the modality. This first model gives us an initial idea of the qualitative variables that discriminate the unemployment duration. However, the various modalities used do not allow generalization and aggregation. This empirical constraint imposes a transition towards a better adapted and more aggregated model: the ordered logit.
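The chi-square column of Table 1 can be checked by hand. The sketch below assumes, as in standard SPSS output, that the fitting criterion is a -2 log-likelihood, that the “Constant” row reports the value of the full model, and that each chi-square is the difference between the reduced model (variable removed) and the full model. Only the figures printed in Table 1 are used; the variable and function names are ours.

```python
# Values copied from Table 1 (decimal commas written as points).
FULL_MODEL = 84344.514  # fitting criterion reported on the "Constant" row

# variable -> (fitting criterion of the reduced model, reported chi-square)
table_1 = {
    "MaritalStatu": (84351.528, 7.014),
    "GouvResidence(3)": (86799.572, 2455.057),
    "Gender(2)": (87426.736, 3082.222),
    "Diploma(4)": (86815.788, 2471.274),
    "Level(5)": (85092.749, 748.234),
    "Age (1)": (90452.300, 6107.786),
    "Specialities(6)": (84468.969, 124.455),
    "YearsDiploma(7)": (84676.538, 332.024),
}


def lr_chi_square(reduced, full=FULL_MODEL):
    """Likelihood ratio statistic: reduced-model criterion minus full-model criterion."""
    return reduced - full


recomputed = {name: lr_chi_square(values[0]) for name, values in table_1.items()}

# Variables ranked from the largest to the smallest chi-square.
ranking = sorted(recomputed, key=recomputed.get, reverse=True)
```

Under these assumptions the recomputed statistics agree with the printed ones up to rounding in the third decimal, and the ranking places age first and marital status last.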
F. Aggregation by ordinal logit (OL)
The ordered logit is another model chosen by many authors of discrete choice models for the modelling of qualitative variables [7]. Applying the ordinal logit to our database gives us, contrary to the previously estimated multinomial logit, an aggregated model of the significant variables that affect the duration of unemployment. We also group the 24 governorates into 7 regional areas in order to avoid the multicollinearity problem (east and west areas of the north, center and south). We first apply the logit to a test sample (10% of the whole database) and then to the entire database, in order to verify and analyze the results.
Table 2. Results of the test sample in the ordinal logit model
The aggregation through the ordinal logit on the test sample is:
\(Y_{duration} = -0.4649897\,X_{sex} + 0.018517\,X_{Age} + 0.3296339\,X_{Secondary} + 0.4270708\,X_{SecondaryProfessional} + 0.9734383\,X_{Higher} + 0.2467303\,X_{EastCenter} - 0.1452191\,X_{WestCenter} - 0.382215\,X_{WestNorth} - 0.7520418\,X_{EastSouth}\)
As shown by the results of the model estimated under STATA 12 for the test sample, the duration is significantly and positively correlated with all the variables of the database, except for female gender and for belonging to the Center West, North West and South East regional areas, with which the duration is significantly but negatively correlated. The only exception is marital status (married), which is not significant and does not discriminate duration. The duration of unemployment is negatively associated with being a female candidate. This is a common feature of many underdeveloped countries such as ours, as reported by a few publications on the subject [31]. The second observation concerns the unemployed candidates belonging to the Center West, North West and South East. This is quite predictable for these regions of western and southern Tunisia, which have been marginalized for decades; it testifies to the strong attachment of the unemployed to these regions (West and South), as also shown by the regional development index of the country, far from the rich and prosperous northern and central regions. These results could be explained by the low population of these regions, the development of the black market with Algeria on the one hand and Libya on the other, the low rate of urbanization (large governorates such as Kebili, for example, with a small number of inhabitants), the lack of employment offices, the discouragement of young people, the low rate of direct investment, the lack of State programs and so on. In addition, the modalities of marital status are positively and significantly correlated with duration, especially the single, widowed and divorced statuses, for which duration rises, contrary to the married modality. Thus, it seems that being married does not help to find a job more easily.
For our database, we were surprised that, although the majority of the registered unemployed are illiterate, this modality is not correlated with the duration of unemployment and does not discriminate it. Regarding the level of schooling, all the modalities are significantly and positively correlated with duration, especially the higher level. This result converges with the official statistics of the country (NSI and NEASE) and of most international organizations (UN, UNDP, OECD). This feature, common once again to less developed countries, is not very discriminating in most developed countries. Again, a significant analytical gap emerges: research on developed countries diverges categorically from research on less developed countries in its results, analyses and reasoning [32][33][34].
After the test sample, we generalised the analysis to our entire database and obtained practically the same results:
Table 3. Results of the ordinal logit model
We find practically the same signs for the most discriminating variables retained by the test model in ordered logit:
\(Y_{duration} = -0.4644573\,X_{sex} + 0.0162642\,X_{Age} + 0.2514821\,X_{Secondary} - 0.1187639\,X_{SecondaryProf} + 0.8138134\,X_{Higher} + 0.2625026\,X_{EastCenter} - 0.1444643\,X_{WestCenter} - 0.3900611\,X_{WestNorth} - 0.7119685\,X_{SouthEast} + 0.06523\,X_{SouthWest} - 0.8925981\,X_{Gouvernorate} - 0.0389405\,X_{YearDiploma}\)
Thus, in addition to the significant variables retained by the test sample, other significant and discriminating variables of the duration appear, in particular the year of graduation and especially the regional affiliation of the candidates (which is significant and negative). If we add the diploma variable, which is not significant (problem of multicollinearity with the specialty and the level of schooling), and the year of graduation (which is significant and negative), we obtain a stable ordinal logit model which converges with the result of the test sample.
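As an illustration of how the full-sample equation can be read, the sketch below evaluates its linear index for two profiles and converts a coefficient into an odds ratio. The coefficients are those printed above; the coding of the variables (0/1 dummies, sex equal to 1 for women, age in years) and the two profiles are assumptions, and since the cut-points of the ordered logit are not reported, no category probabilities are computed.

```python
import math

# Coefficients of the full-sample ordinal logit equation, as printed in the text.
COEFFICIENTS = {
    "sex": -0.4644573,
    "age": 0.0162642,
    "secondary": 0.2514821,
    "secondary_prof": -0.1187639,
    "higher": 0.8138134,
    "east_center": 0.2625026,
    "west_center": -0.1444643,
    "west_north": -0.3900611,
    "south_east": -0.7119685,
    "south_west": 0.06523,
    "gouvernorate": -0.8925981,
    "year_diploma": -0.0389405,
}


def linear_index(profile):
    """Weighted sum of the profile; variables left out are taken as zero."""
    return sum(COEFFICIENTS[name] * value for name, value in profile.items())


def odds_ratio(name):
    """Multiplicative change in the odds for a one-unit increase of `name`."""
    return math.exp(COEFFICIENTS[name])


# Two hypothetical candidates who differ only by sex (assumed coding: 1 = female).
candidate_a = {"sex": 1, "age": 30, "higher": 1, "east_center": 1}
candidate_b = {"sex": 0, "age": 30, "higher": 1, "east_center": 1}

index_gap = linear_index(candidate_a) - linear_index(candidate_b)
```

Under this coding the gap between the two indices is simply the sex coefficient, and an odds ratio below one corresponds to a negative coefficient.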
2.3 Relational schema
A decision support system (DSS), in which we have included prior probability distributions [35] and the SHARK (Search Hierarchic Association Rules for Knowledge) algorithm [36], is applied. For a better interaction with the user, whose task is to direct the search, an easy-to-use interface has also been developed. This tool is a necessary support to help the experts in their analysis. Indeed, such software could be programmed with the best filtered data and the most representative algorithms and still remain unsuitable at times, because of the difficulty and complexity of the matter in hand. It is therefore necessary to work bearing in mind that the final application (the software) should meet the expectations and aspirations of its users. We used a specific graphical interface to program and increment our algorithmic rules relating to duration. The principle is based on the anthropocentric approach.
Figure 3. Data Mining process and workflow [8]
This guiding thread reduces the expertise time and the number of association rules generated by this procedural method, by placing the user-analyst at the heart of the Data Mining process. Hierarchical search proceeds level by level in order to quickly propose generalized rules to a user, who judges their relevance. His choices guide the process at the following levels, where the rules are made more specific. In that way, the rules generated are fewer and more targeted, because the user guides the search from the beginning to the end of the process. A set of algorithms has been developed to try to meet these specifications.
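The level-by-level, user-guided search described above can be sketched as follows. This is not the SHARK algorithm, whose details are not given here; it is a simplified illustration in which a callback named is_relevant stands in for the judgment of the user-analyst, and all names and records are hypothetical.

```python
def matches(record, condition):
    """True when the record satisfies every attribute=value pair of the condition."""
    return all(record[key] == value for key, value in condition.items())


def guided_rule_search(records, attributes, target, is_relevant):
    """Level-wise rule search in which only accepted rules are specialised.

    At each level one more attribute is added to the condition. A rule is a
    tuple (condition, mean of target, number of cases). Rules rejected by
    `is_relevant` are dropped and never refined at deeper levels.
    """
    frontier = [{}]  # start from the most general (empty) condition
    kept = []
    for attribute in attributes:
        next_frontier = []
        for condition in frontier:
            covered = [r for r in records if matches(r, condition)]
            for value in sorted({r[attribute] for r in covered}):
                new_condition = {**condition, attribute: value}
                rows = [r for r in covered if r[attribute] == value]
                rule = (new_condition, sum(r[target] for r in rows) / len(rows), len(rows))
                if is_relevant(rule):
                    kept.append(rule)
                    next_frontier.append(new_condition)
        frontier = next_frontier
    return kept


# Hypothetical registrants; the "user" keeps rules with a mean duration of 120 days or more.
records = [
    {"area": "North", "gender": "M", "duration": 140},
    {"area": "North", "gender": "F", "duration": 110},
    {"area": "South", "gender": "M", "duration": 100},
    {"area": "South", "gender": "F", "duration": 90},
]
rules = guided_rule_search(records, ["area", "gender"], "duration",
                           is_relevant=lambda rule: rule[1] >= 120)
```

Because the rule on the South area is rejected at the first level, none of its refinements is generated, which is how guidance by the user keeps the rule set small.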
To that end we built a multipurpose tool: a simulator. It first calculates the average unemployment duration for each registered candidate and his probability of unemployment; above all, it is a decision-making tool able to reduce the time lost before getting a job, and consequently to increase the probability of leaving unemployment. The static database on which we focused our efforts covers the period from 2005 to 2015 for individuals already registered. We faced the classic problems of statistical censoring and truncation, which could be overcome through incremental updates to the system. This is one of the advantages of intelligent self-learning systems. Self-learning allows the simulator to take into account the instantaneous changes (numbers and profiles) affecting the database and thus helps to extract realistic rates, durations and decisions that may affect unemployed persons. All this required a transition to computer programming through an Access database, in order to define correctly the scripts and algorithms that establish and express the relations and operations on the data set (or recommendations). These methods allow us to simulate the processes of human reasoning (inference, analogy and deduction) based on the available basic knowledge [37][38]. Another very important aspect of developing intelligent systems is their ability to acquire new knowledge (sometimes from several different sources) and to evaluate it.
We then tried to answer the major research problem, “how to manage the individual determinants of the duration of unemployment in Tunisia”, with a hybrid system combining a database and a set of interconnected algorithms, established within an intelligent decision-making simulator and combined with the human expertise of the NEASE agents. The information collected through the NEASE [9] network is processed by a learning process. This processing relies both on updates and on the correction of gaps or inconsistencies, thanks to the use of a constructive network. An incremental rule-extraction method was integrated into the system, together with knowledge validation algorithms, marrying connectionist and symbolic modules [40]. The simulator was thus designed as an automatic learning system for the constructive (incremental) acquisition of knowledge [41][42].
All this laid the foundation of the context in which our application to the Tunisian labour market is made. It aims to take into account the ever-increasing number of applicants, the absence of a personalized follow-up and of precise and detailed ratios or indicators for this market, the individual characteristics of the applicants, and the diversity of inadequate offers. The public intermediary is now given all the means necessary to carry out its duties: it must play its role of consultant fully and find ways to determine the matching and to control it. The offices of this Agency now have at their disposal an instant dashboard to monitor the situation of the unemployed; they are able to advise them and to supervise the effectiveness of the measures correcting duration and hence unemployment. It is in this sense, and for those goals, that the idea of a simulator of the duration of unemployment arose.
Concretely, we will first be able to calculate the instantaneous duration of the registered unemployed via a simulator of the duration of unemployment, taking into account the individual specifications and the association rules established through our initial database. We can then add any new registrant, so that the database is automatically updated and the calculations remain accurate and unbiased. In a last step, and not the least, we will present the recommendations of this simulator to reduce the duration for each registrant according to his personal variables and attributes. This will be preceded by the integration of the data set, or physical schema, of the simulator.
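The incremental update can be illustrated by a running mean that is adjusted each time a registrant is added, without recomputing over the whole database. This is a minimal sketch under our own assumptions: the class name, its methods and the example profile are hypothetical and do not describe the Access implementation of the simulator.

```python
from collections import defaultdict


class IncrementalDurations:
    """Running count and mean duration per profile, updated one registrant at a time."""

    def __init__(self):
        self._count = defaultdict(int)
        self._mean = defaultdict(float)

    @staticmethod
    def _key(profile):
        # A sorted tuple of items makes the profile usable as a dictionary key.
        return tuple(sorted(profile.items()))

    def add(self, profile, duration_days):
        """Register one more case and update the mean in constant time."""
        key = self._key(profile)
        self._count[key] += 1
        self._mean[key] += (duration_days - self._mean[key]) / self._count[key]

    def stats(self, profile):
        """Return (number of cases, mean duration) for the profile."""
        key = self._key(profile)
        return self._count.get(key, 0), self._mean.get(key, 0.0)


simulator = IncrementalDurations()
profile = {"gender": "M", "governorate": "Ariana"}
for days in (100, 120, 140):
    simulator.add(profile, days)

cases, mean_duration = simulator.stats(profile)
```

Each new registration changes only the count and mean of its own profile, so the figures shown to the agent stay current after every insertion.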
Figure 4. Physical relational schema of the simulator
Finally, we obtain a decision support computer system (DSCS), for which we propose the following graphical interface.
Figure 5. Graphic Interface of the simulator
3 Primary analysis of the unemployment duration and simulator implementation
3.1 Geographical distribution of the duration of unemployment
The primary analysis of the unemployment duration across the governorates reveals an unbalanced distribution. Contrary to the classic observation that coastal areas (rich and prosperous in theory) are not affected by unemployment, and therefore by the duration associated with it, we found a confused and sometimes contradictory relationship between theory and official statistics.
Figure 6. Distribution of unemployment duration in
Tunisia
The zones can be classified into four groups:
• The regions most affected by a long duration of unemployment are Tunis, Monastir, Mehdia, Sfax and Tozeur, where the average duration varies between 130 and 142 days.
• Next come Bizerte, Sousse, Mannouba, Ariana, Sidi Bouzid and Gafsa, with an average waiting time between 123 and 129 days.
• The Beja, Zaghouane, BenArous, Nabeul, Silena, Kasserine and Kébili governorates follow, with durations between 113 and 123 days.
• Last come Jendouba, Kef, Kairouan, Tataouine, Medenine and Gabes, with an average time between 96 and 112 days.
Thus this macroeconomic imbalance in the distribution of duration already determines the framework of the microeconomic analysis that we are about to carry out through the simulator.
3.2 Activation of the simulator and its implementation
Combining the various criteria (and / or variables) of our initial database gives a double result. First, the simulator gives the instantaneous individual duration of each candidate according to governorate, gender, marital status, age, diploma and specialty. Then, it determines the number of cases matching these criteria in the population registered in the employment offices. For example, by simulating the average duration for the population of registered unemployed males, without any further selection criteria, we obtain an average duration of 108 days for 201,835 individuals.
Figure 6. Average of duration and number of cases
If we then take the governorate of Ariana as the affiliation criterion, for example, in addition to male gender and unmarried marital status, an average unemployment duration of 132.24 days is obtained, for 6407 registered unemployed.
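The two queries above amount to filtering the registrants on the chosen criteria and returning the average duration together with the number of cases. The sketch below reproduces this logic on four invented records; the function name simulate and the field names are assumptions, and the figures are not those of the NEASE database.

```python
def simulate(records, **criteria):
    """Return (average duration, number of cases) for the registrants matching all criteria."""
    matched = [r for r in records
               if all(r.get(field) == value for field, value in criteria.items())]
    if not matched:
        return None, 0
    return sum(r["duration"] for r in matched) / len(matched), len(matched)


# Hypothetical registrants (illustrative values only).
records = [
    {"gender": "M", "governorate": "Ariana", "marital_status": "unmarried", "duration": 130},
    {"gender": "M", "governorate": "Ariana", "marital_status": "married", "duration": 100},
    {"gender": "M", "governorate": "Tunis", "marital_status": "unmarried", "duration": 140},
    {"gender": "F", "governorate": "Ariana", "marital_status": "unmarried", "duration": 90},
]

all_men = simulate(records, gender="M")
ariana_single_men = simulate(records, gender="M", governorate="Ariana",
                             marital_status="unmarried")
```

Adding criteria narrows the population, so the number of cases can only stay equal or fall while the average moves towards that of the selected profile.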
Figure 7. Gender and residence affiliation
This first overview of the simulated duration of unemployment is followed by a series of recommendations suggested by the simulator to reduce this duration and improve the probability of leaving unemployment. The aim is to reduce the duration of unemployment and thus to feed back into active employment policies.
The simulator first presents the recommendations on geographical mobility, with particular attention to the areas to which the unemployed belong. It lists, in increasing order of duration, the districts of the same area and also the geographically closest area.
Figure 8. Duration for the same candidate profile at
the national level
In addition, a second set of recommendations is proposed by the simulator according to the study specialties, for the jobless whose waiting times are the longest. Their probability of leaving unemployment is then the highest. The simulator pays particular attention to the area to which the unemployed belong, in order to propose the district that best matches their profiles, their specialties and their vocational training (smallest duration).
Figure 9. Chronological recommendation by speciality
From there, the two sets of recommendations can be combined to recommend, finally, the vocational training and the districts in line with the candidate's attributes or determinants.
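The spatial recommendation can be sketched as a ranking of the districts of the candidate's area by increasing average duration for the same profile. The function below is an illustration under our own assumptions; the field names, the district labels and the durations are hypothetical.

```python
from collections import defaultdict


def recommend_districts(records, profile, area, top=3):
    """Rank the districts of `area` by increasing mean duration for the given profile."""
    totals = defaultdict(lambda: [0.0, 0])  # district -> [sum of durations, cases]
    for record in records:
        same_profile = all(record.get(k) == v for k, v in profile.items())
        if record["area"] == area and same_profile:
            totals[record["district"]][0] += record["duration"]
            totals[record["district"]][1] += 1
    ranked = sorted((total / cases, district)
                    for district, (total, cases) in totals.items())
    return [(district, mean) for mean, district in ranked[:top]]


# Hypothetical registrants (illustrative values only).
records = [
    {"area": "area_1", "district": "district_a", "specialty": "IT", "duration": 132},
    {"area": "area_1", "district": "district_a", "specialty": "IT", "duration": 128},
    {"area": "area_1", "district": "district_b", "specialty": "IT", "duration": 115},
    {"area": "area_1", "district": "district_c", "specialty": "IT", "duration": 125},
    {"area": "area_2", "district": "district_d", "specialty": "IT", "duration": 100},
    {"area": "area_1", "district": "district_b", "specialty": "Law", "duration": 90},
]

recommendations = recommend_districts(records, {"specialty": "IT"}, "area_1")
```

The same ranking could be applied to specialties instead of districts, and the two lists then combined as described above.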
Figure 10. Spatial recommendation of the simulator
We thus have a new decision-support tool, able to help labour market intermediaries, and in particular the NEASE agency, to fulfil their role as consultant and advisor. The more criteria are associated with the candidate, the more efficient and accurate the calculations and suggested measures become.
The role assigned to NEASE agencies until now, limited to that of a simple data collector, neutral most of the time and incapable of making decisions or advising jobseekers, now has every chance to change, so that the agency fulfils the real function for which it was created. This is the advantage of this innovative tool for the intermediary. The simulator of unemployment duration should make it possible for the NEASE offices to have an instant dashboard to monitor the situation of the unemployed, to advise them, to supervise the efficiency of corrective measures and thereby to contribute to reducing unemployment. This application will give job seekers and companies the means to be served by any employment office in Tunisia. The centralized database results from merging all the databases of the job market actors, which are instantly updated. This dashboard reflects all evolutions, and the related calculations are fairly accurate and credible. Individual information on demand and supply, as well as other indicators, is provided by the NEASE representations.
In fact, the simulator gives the managers of the employment database the ability to update it at any time from any office, thanks to a login and a password belonging to the accredited agents.
Figure 11. Management of Database
4 Conclusion
This new computerized decision-making tool consequently provides a source of both personal and global information for those in charge of managing the unemployment problem, the NEASE agencies in particular. Indeed, the duration of unemployment indicates the length of the episodes of unemployment for a candidate with a given profile and set of characteristics and attributes. Thus, by taking into account the variables of the individuals registered at the employment offices and applying the associative algorithms established by the simulator, it is possible to determine precisely, for a particular profile, the duration, the number of cases involved, etc.
In a second step, the simulator establishes, on a national scale and at the level of the 7 areas, 24 governorates and 256 delegations, the duration of unemployment according to distinctive criteria.
In a final step, the simulator fully plays its decision-making role by
proposing a shorter duration in a geographically adjacent (spatially
contiguous) district belonging to the candidate's overall area. At the
same time, the simulator also lists diplomas and study specialties
ordered by duration, which gives an idea of the most requested
training courses and of the locations associated with the shortest
durations.
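The two outputs of this final step can be sketched as follows. The district labels, the contiguity map, the durations and the function names (`suggest_district`, `rank_by_duration`) are all hypothetical and serve only to illustrate the logic; they are not taken from the NEASE database or from the simulator itself.

```python
# Hypothetical mean durations (months) per district for one candidate profile.
MEAN_MONTHS = {"D1": 14.0, "D2": 9.5, "D3": 16.0, "D4": 11.0}

# Hypothetical contiguity map: which districts share a border.
ADJACENT = {"D1": ["D2", "D3"], "D2": ["D1", "D4"], "D3": ["D1"], "D4": ["D2"]}

# Hypothetical mean durations (months) per study specialty.
SPECIALTY_MONTHS = {"law": 20.0, "engineering": 10.0, "nursing": 7.5}


def suggest_district(home, mean_months, adjacency):
    """Return the adjacent district with the shortest mean duration.

    Only neighbours whose duration is strictly shorter than the home
    district's are considered; None means no neighbour does better.
    """
    shorter = [
        district for district in adjacency.get(home, [])
        if district in mean_months and mean_months[district] < mean_months[home]
    ]
    return min(shorter, key=mean_months.get) if shorter else None


def rank_by_duration(mean_months):
    """Return the keys ordered from shortest to longest mean duration."""
    return sorted(mean_months, key=mean_months.get)


print(suggest_district("D1", MEAN_MONTHS, ADJACENT))
print(rank_by_duration(SPECIALTY_MONTHS))
```

Restricting the search to contiguous districts keeps the suggestion realistic for a candidate who is willing to move only a short distance.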
This should make it possible to better advise candidates in
search of employment, relating their willingness to migrate to the vocational
training they have, for a better probability of leaving unemployment. On the
whole, this should enable policy-makers to determine which training
generates jobs and which should be reduced or eliminated.
At the same time, this should enable us to achieve two objectives: one
at the individual level (exit from unemployment) and the other at a more
macroeconomic and even strategic level (better matching between job supply
and demand). This practical and innovative tool also has the advantage of
being incremental, meaning that any new registration is automatically
counted and the calculated duration is always up to date.
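The incremental property can be illustrated with a running count and total per profile, so that each new registration updates the mean duration without rescanning the whole database. The class name `DurationTracker`, its methods and the sample profile are assumptions made for this sketch and do not describe the actual implementation of the simulator.

```python
from collections import defaultdict


class DurationTracker:
    """Keep a running count and total duration per profile (illustrative)."""

    def __init__(self):
        self._count = defaultdict(int)
        self._total = defaultdict(float)

    def register(self, profile, months):
        """Record one new registration for the given profile."""
        self._count[profile] += 1
        self._total[profile] += months

    def cases(self, profile):
        """Number of registrations seen so far for the profile."""
        return self._count.get(profile, 0)

    def mean_months(self, profile):
        """Mean duration in months, or None if the profile is unknown."""
        if self.cases(profile) == 0:
            return None
        return self._total[profile] / self._count[profile]


tracker = DurationTracker()
tracker.register(("engineering", "D1"), 14)
tracker.register(("engineering", "D1"), 10)
print(tracker.cases(("engineering", "D1")),
      tracker.mean_months(("engineering", "D1")))
```

Storing only the count and the total per profile is enough to keep the mean current; other statistics would need additional running quantities.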
The management of the individual determinants of the duration of unemployment
will now be implemented by NEASE and its agents. They can thus view, at any time
and in any office, the situation of all candidates according to their various
characteristics or attributes. In light of the information gathered, they can inform, guide, advise and
analyze in real time and ensure efficient management.
References
[1] Autor, D. H. (2001). Wiring the labor market. The Journal of Economic
Perspectives, 15(1), 25-40.
[2] Bessy, C., & Chauvin, P. M. (2013). The power of market
intermediaries: From information to valuation processes. Valuation
studies, 1(1), 83-117.
[3] Autor, D. (2008). The economics of labor market intermediation: An
analytic framework.
[4] Mortensen, D. T., & Pissarides, C. A. (1999). New developments in
models of search in the labor market. Handbook of labor economics, 3,
2567-2627.
[5] Stigler, G. J. (1962). Information in the labor market. Journal of
Political Economy, 70(5, Part 2: Investment in Human Beings), 94-105.
University of Chicago Press.
[6] Miller, M. H., & Rock, K. (1985). Dividend policy under asymmetric
information. The Journal of Finance, 40(4), 1031-1051.
[7] Spence, M. (1973). Job market signaling. The Quarterly Journal
of Economics, 87(3), 355-374.
[8] de Larquier, G. (1997). Principes des marchés régis par appariement.
Revue économique, 48(6), 1409-1438. Sciences Po University Press.
[9] Jovanovic, B. (1979). Job matching and the theory of turnover. Journal
of political economy, 87(5, Part 1), 972-990
[10] Bleakley, H., & Fuhrer, J. C. (1997). Shifts in the Beveridge curve,
job matching, and labor market dynamics. New England Economic Review, 3.
[11] Noe, R. A., Hollenbeck, J. R., Gerhart, B., & Wright, P. M. (2006). Human
resource management: Gaining a competitive advantage.
[12] Bizer, C., Heese, R., Mochol, M., Oldakowski, R., Tolksdorf, R., &
Eckstein, R. (2005). The impact of semantic web technologies on job recruitment
processes. In Wirtschaftsinformatik 2005 (pp. 1367-1381).
Physica-Verlag HD.
[13] Rumberger, R. W., & Levin, H. M. (1985). Forecasting the impact of
new technologies on the future job market. Technological Forecasting and
Social Change, 27(4), 399-417.
[14] Houtsma, M., & Swami, A. (1995). Set-oriented data mining in relational
databases. Data & Knowledge Engineering, 17(3), 245-262.
[15] Kodratoff, Y. (2001). Applications de l'apprentissage automatique et de la
fouille de données. In EGC (pp. 57-68).
[16] Piatetsky-Shapiro, G., & Frawley, W. (1991). Knowledge discovery in
databases. MIT Press.
[17] Brin, S., Motwani, R., Ullman, J. D., & Tsur, S. (1997, June).
Dynamic itemset counting and implication rules for market basket data. In ACM
SIGMOD Record (Vol. 26, No. 2, pp. 255-264). ACM.
[18] Li, L., & Zhang, M. (2011, July). The strategy of mining association
rule based on cloud computing. In Business Computing and Global Informatization
(BCGIN), 2011 International Conference on (pp. 475-478). IEEE.
[19] Leung, M. D. (2014). Dilettante or renaissance person? How the order of
job experiences affects hiring in an external labor market. American
Sociological Review, 79(1), 136-158.
[20] Poterba, J. M., & Summers, L. H. (1995). Unemployment benefits and
labor market transitions: A multinomial logit model with errors in
classification. The Review of Economics and Statistics, 207-216.
[21] Williamson, O. E. (1991). Comparative economic organization: The analysis of
discrete structural alternatives. Administrative Science Quarterly, 269-296.
[22] Elton, M. D., & Book, W. J. (2010). Operator efficiency improvements
from novel human-machine interfaces. Georgia Institute of Technology.
[23] Takagi, H. (2001). Interactive evolutionary computation: Fusion of the
capabilities of EC optimization and human evaluation. Proceedings of the
IEEE, 89(9), 1275-1296.
[24] Abdi, H., & Valentin, D. (2007). Multiple correspondence
analysis. Encyclopedia of measurement and statistics, 651-657.
[25] Pindyck, R. S., & Rubinfeld, D. L. (1991). Econometric models. Economic
Forecasts, 3.
[26] McFadden, D. L. (1984). Econometric analysis of qualitative response
models. Handbook of econometrics, 2, 1395-1457.
[27] Gujarati, D. N. (2009). Basic econometrics. Tata McGraw-Hill
Education.
[28] Phillips, P. C., & Sul, D. (2007). Transition modeling and
econometric convergence tests. Econometrica, 75(6), 1771-1855.
[29] McFadden, D. (1974). Conditional logit analysis of qualitative choice
behavior. Major course at University of California, Berkeley.
[30] Flinn, C., & Heckman, J. (1983). Models of the analysis of labor force
dynamics. In R. Basmann & G. Rhodes (Eds.), Advances in Econometrics (Vol. 1,
pp. 35-95). Greenwich: JAI Press.
[31] Foley, M. C. (1997, August). Determinants of unemployment duration in
Russia. Economic Growth Center, Yale University.
[32] Kahraman, E. (2011, June). Youth employment and unemployment in
developing country: Macro challenges with micro perspectives (PhD thesis).
[33] McCormick, B. (1990). A theory of signalling during job search,
employment efficiency, and stigmatised jobs. The Review of Economic
Studies, 57, 299-313.
[34] Maki, D. R., & Spindler, Z. A. (1975, November). The effect of unemployment
compensation on the rate of unemployment in Great Britain. Oxford
Economic Papers, 27(3), 440-454.
[35] Gregg, P., & Wadsworth, J. (1996). How effective are state employment
agencies? Jobcentre use and job matching in Britain. Oxford Bulletin of
Economics and Statistics, 58(3), 443-467.
[36] Agrawal, R., Imielinski, T., & Swami, A. (1993). Database
mining: A performance perspective. IEEE Transactions on Knowledge and Data
Engineering, 5(6), 914-925.
[37] Greenberg, J., & Baron, R. A. (2003). Behavior in organizations: Understanding
and managing the human side of work. Pearson College Division.
[38] Diaper, D. (2004). Understanding task analysis for human-computer
interaction. The handbook of task analysis for human-computer
interaction, 5-47.
[39] Han, J., & Kamber, M. (2001). Data mining: Concepts and
technologies. Models Methods & Algorithms, 5(4), 1-18.
[40] Srikant, R., & Agrawal, R. (1996). Mining sequential patterns:
Generalizations and performance improvements. In International
Conference on Extending Database Technology (pp. 1-17). Springer Berlin
Heidelberg.
[41] Houtsma, M., & Swami, A. (1995). Set-oriented data mining in
relational databases. Data & Knowledge Engineering, 17(3), 245-262.
[42] Diaper, D., & Sanger, C. (2006). Tasks for and tasks in
human-computer interaction. Interacting with Computers, 18(1), 117-138.
Declarations
· Ethics approval and consent to participate
'Not applicable'
· Consent for publication
'Not applicable'
· List of abbreviations
NEASE = National Employment Agency and Self employment.
IT = Information technologies.
CSDM = Computer system for decision-making.
KDD = Knowledge Discovery in Databases.
DMP = Data Mining process.
HMI = Human-machine interaction.
NIS/NSI = National Institute of statistics of Tunisia.
MCA = Multiple Factor Component Modelling.
ML = Multinomial Logit.
OL = Ordered Logit.
DSS = Decision Support System.
SHARK = Search Hierarchic Association Rules for Knowledge.
· Availability of data and material
The data on the unemployment duration of individuals in Tunisia were sent to me
by Mr Mohamed Fadhel Berhouma, Director of Information Systems and Management of
Method at the National Employment Agency and Self employment in Tunisia.
Address: 19 Rue Asdrubal, 1002 Tunis, Tunisie
Tel.: +216.71.781.200 / Fax: +216.71.783.236
· Competing interests
'No competing interests for this research'
· Funding
'No funding was used for this research'
· Authors' contributions
To answer the main problem of this research paper (management of the
individual determinants of the duration of unemployment), I have tried to
explain:
- Intermediation and employment policies
- The development of software for calculating the duration of unemployment
(Simulator)
- The redefinition of the role of the intermediary in Tunisia
The main contributions can be summarized in 5 points:
• A new computerized decision support tool
providing information for both registered unemployed persons and the organizations
responsible for managing the unemployment problem, in particular the NEASE.
• Using this new software, it is possible
to determine, for a particular profile of candidates, the duration of unemployment and the
number of cases involved and, on a macroeconomic scale, the duration of
unemployment for the same profile of candidates nationwide.
• It also lists diplomas and study
specialties ordered by duration, so that one
can get an idea of the most requested diploma courses and of those associated
with the longest durations. This
should make it possible to better advise candidates in search of employment,
relating their willingness to migrate to the training they have, for a
better probability of leaving unemployment.
- This makes it possible for policy-makers to
determine which training generates employment and which should be reduced or
eliminated.
• This tool is incremental, so any new
enrollment is counted automatically and the calculated duration
is updated.
• Management of the individual determinants
of the duration of unemployment will now be implemented by NEASE and its regional offices. They
can thus view, at any time and in any office, the detailed situation of all
candidates according to their various characteristics or attributes. They can
therefore inform, guide, advise and analyze. They will then play the main role
for which they were created, with a more effective and instantaneous management
of the registered unemployed.
· Acknowledgements
- Mr. Ricco Rakotomalala, lecturer at the
University of Lyon 2.
- Mr. François Husson, Professor at
Agrocampus-Ouest and Director of the IRMAR laboratory.
- Mr. William Greene of the New York
University Stern School of Business.
- Mr. Adel Ben Rhouma, Director of IT at
the National Agency for Employment and Independent Labor.
· Authors' information (optional)
Anis Ben Ahmed Lachiheb
Address: Résidence les 3 Palmiers, Avenue Abdlahmid Essaka, Bouhsina, Sousse (4000) (TUNISIA)
Tel: (00216)53918558-20504900
Affiliation: Economics, Management and Quantitative Finance Laboratory (LaREMFiQ), IHEC Sousse (TUNISIA)
Address: Route Hzamia Sahloul 3, BP n° 40, 4054 Sousse
Tel: +216 73 368 351 - 73 368 358
Fax: +216 73 368 350
Corresponding author: Anis Ben Ahmed Lachiheb
JEL classification: R 23, R 19, R 38
[1] Information technologies
[2] Computer system for decision-making
[3] Knowledge Discovery in Databases
[4] Knowledge Discovery in Databases
[5] Hegland, M. (2001). Data mining techniques. Acta Numerica 2001, 10, 313-355.
[6] According to the National Institute of statistics of Tunisia and to ILO recommendations
[7] Olsen R., Smith D., Farkas G. (1986). "Structural and Reduced Form Models of Choice
among Alternatives in Continuous Time: Youth Employment under a Guaranteed Jobs
Program", Econometrica, vol. 54, pp. 375-394.
Flinn C., Heckman J. (1983). "Models of the Analysis of Labor Force
Dynamics", in Advances in Econometrics, vol. I, R. Basmann and G. Rhodes
(eds.), JAI Press, Greenwich, pp. 35-95.
[9] Tunisian agency for public job intermediation